Nucleic acids, vectors, and host cells including mycobacterium derived sequences

ABSTRACT

The invention relates to nucleic acids which contain particularly a nucleotide sequence extending from the extremity constituted by the nucleotide at position (1) to the extremity constituted by the nucleotide at position (1211) represented on the figure, and to the polypeptides coded by said nucleic acids. The polypeptides of the invention can be used for the diagnosis of tuberculosis, and can also be part of the active principle in the preparation of a vaccine against tuberculosis.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates to polypeptides and peptides, particularly recombinant polypeptides and peptides, which can be used for the diagnosis of tuberculosis. The invention also relates to a process for preparing the above-said polypeptides and peptides, which are in a state of biological purity such that they can be used as part of the active principle in the preparation of vaccines against tuberculosis.

It also relates to nucleic acids coding for said polypeptides and peptides.

Furthermore, the invention relates to the in vitro diagnostic methods and kits using the above-said polypeptides and peptides and to the vaccines containing the above-said polypeptides and peptides as active principle against tuberculosis.

By "recombinant polypeptides or peptides" it is to be understood that it relates to any molecule having a polypeptidic chain liable to be produced by genetic engineering, through transcription and translation, of a corresponding DNA sequence under the control of appropriate regulation elements within an efficient cellular host. Consequently, the expression "recombinant polypeptides" such as is used herein does not exclude the possibility for the polypeptides to comprise other groups, such as glycosylated groups.

The term "recombinant" indeed involves the fact that the polypeptide has been produced by genetic engineering, particularly because it results from the expression in a cellular host of the corresponding nucleic acid sequences which have previously been introduced into the expression vector used in said host.

Nevertheless, it must be understood that this expression does not exclude the possibility for the polypeptide to be produced by a different process, for instance by classical chemical synthesis according to methods used in the protein synthesis or by proteolytic cleavage of larger molecules.

The expression "biologically pure" or "biological purity" means on the one hand a grade of purity such that the recombinant polypeptide can be used for the production of vaccinating compositions and on the other hand the absence of contaminants, more particularly of natural contaminants.

2. Description of the Prior Art

Tuberculosis remains a major disease in developing countries. The situation is dramatic in some countries, particularly where high incidence of tuberculosis among AIDS patients represents a new source of dissemination of the disease.

Tuberculosis is a chronic infectious disease in which cell-mediated immune mechanisms play an essential role both for protection against and control of the disease.

Despite BCG vaccination, and some effective drugs, tuberculosis remains a major global problem. Skin testing with tuberculin PPD (purified protein derivative), largely used for screening of the disease, is poorly specific, due to cross reactivity with other pathogenic or environmental saprophytic mycobacteria.

Moreover, tuberculin PPD when used in serological tests (ELISA) does not permit discrimination between patients who have been vaccinated with BCG, or those who have been primo-infected, and those who are developing evolutive tuberculosis and for whom an early and rapid diagnosis would be necessary.

A protein with a molecular weight of 32-kDa has already been purified from zinc deficient M. bovis BCG culture filtrate. This protein was identified as antigen 85A (De Bruyn J. et al., 1987, "Purification, partial characterization and identification of a 32-kDa protein antigen of Mycobacterium bovis BCG" Microb. Pathogen. 2:351). Its NH₂-terminal amino acid sequence (Phe-Ser-Arg-Pro-Gly-Leu) (SEQ ID NO:1) is identical to that reported for the α-antigen (antigen 85B) protein purified from M. bovis BCG (Wiker, H. G. et al., 1986, "MPB59, a widely cross-reacting protein of Mycobacterium bovis BCG" Int. Arch. Allergy Appl. Immunol. 81:307). The antigen 85-complex is present among different strains of mycobacteria (De Bruyn J. et al., 1989, "Effect of zinc deficiency of the appearance of two immunodominant protein antigens (32-kDa and 65-kDa) in culture filtrates of Mycobacteria" J. Gen Microbiol. 135:79). It is secreted by living bacilli as a predominant protein in normal Sauton culture filtrate and could be useful in the serodiagnosis of tuberculosis (Turneer M. et al., 1988, "Humoral immune response in human tuberculosis: immunoglobulins G, A and M directed against the purified P32 protein antigen of Mycobacterium bovis bacillus Calmette-Guerin" J. Clin. Microbiol. 26:1714) and leprosy (Rumschlag H. S. et al., 1988, "Serological response of patients with lepromatous and tuberculosis leprosy to 30-, 31- and 32-kilodalton antigens of Mycobacterium tuberculosis" J. Clin. Microbiol. 26:2200). Furthermore, the 32-kDa protein induced specific lymphoproliferation and interferon-γ (IFN-γ) production in peripheral blood leucocytes from tuberculosis patients (Huygen K. et al., 1988, "Specific lymphoproliferation, γ-interferon production and serum immunoglobulin G directed against a purified 32-kDa mycobacterial antigen (P32) in patients with active tuberculosis" Scand. J. Immunol. 27:187), from leprosy patients and from PPD- and lepromin-positive healthy subjects. Recent findings indicate that the amount of 32-kDa protein-induced IFN-γ in BCG-sensitized mouse spleen cells is under probable H-2 control (Huygen K. et al., 1989, "H-2-linked control of in vitro γ interferon production in response to a 32-kilodalton antigen (P32) of Mycobacterium bovis bacillus Calmette-Guerin" Infect. Imm. 56:3196). Finally, the high affinity of mycobacteria for fibronectin is related to proteins of the antigen 85-complex (Abou-Zeid C. et al., 1988, "Characterization of fibronectin-binding antigens released by Mycobacterium tuberculosis and Mycobacterium bovis BCG" Infect. Imm. 56:3046).

Wiker et al. (Wiker H. G. et al., 1990, "Evidence for three separate genes encoding the proteins of the mycobacterial antigen 85 complex" Infect. Immun. 58:272) showed recently that the antigens 85A, B and C isolated from M. bovis BCG culture filtrate present a few amino acid replacements in their NH₂ terminal region strongly suggesting the existence of multiple genes coding for these proteins. But, the data given for the antigen 85C of M. bovis BCG are insufficient to enable its unambiguous identification as well as the characterization of its structural and functional elements.

The gene encoding the 85A antigen from Mycobacterium tuberculosis has been described (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis" Infect. Immun. 57:3123) which presented 77.5% homology at the DNA level within the coding region with the α-antigen gene (85B gene of M. bovis BCG, substrain Tokyo) (Matsuo K. et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α-antigen" J. Bacteriol. 170:3847). Moreover, recently a corresponding 32-kDa protein genomic clone from a λgt11 BCG library (prepared from strain M. bovis BCG 1173P2) was isolated and sequenced. The complete sequence of this gene is identical with that from the 85A gene of Mycobacterium tuberculosis except for a single silent nucleotide change (De Wit L. et al., 1990, "Nucleotide sequence of the 32 kDa-protein gene (antigen 85A) of Mycobacterium bovis BCG" Nucl. Ac. Res. 18:3995). Thus, it was likely, but not demonstrated, that the genome of M. bovis BCG contained at least two genes coding for antigen 85A and 85B respectively. As to the genome of the Mycobacterium tuberculosis and M. bovis, nothing was proved as to the existence of new genes, besides the genes coding respectively for 85A and 85B.

SUMMARY OF THE INVENTION

An aspect of the invention is to provide a new family of nucleic acids coding for new proteins and polypeptides which can be used for the detection and control of tuberculosis.

Another aspect of the invention is to provide nucleic acids coding for the peptidic chains of biologically pure recombinant polypeptides which enable their preparation on a large scale.

Another aspect of the invention is to provide antigens which can be used

in serological tests as an in vitro rapid diagnostic test for tuberculosis or in skin test,

or as immunogenic principle of a vaccine.

Another aspect of the invention is to provide a rapid in vitro diagnostic means for tuberculosis, enabling discrimination between patients suffering from an evolutive tuberculosis and those who have been vaccinated with BCG or who have been primo-infected.

Another aspect of the invention is to provide nucleic probes which can be used as in vitro diagnostic reagents for tuberculosis as well as in vitro diagnostic reagents for distinguishing M. tuberculosis from other strains of mycobacteria.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 represents the nucleotide (SEQ ID NO:2) and amino acid (SEQ ID NO:3) sequences of the 85C antigen containing region of M. tuberculosis.

The previously identified 28-residue NH₂ terminal amino acid sequence of the mature protein is underlined with a double line. One additional ATG codon, downstream from the ATG at position 150, is underlined. Since the precise length of the signal sequence could not be determined, the option taken here represents the 46 amino acid signal peptide corresponding to ATG₁₅₀. The putative signal peptide is represented in italic capitals. The top drawing represents the sequencing strategy. Arrows indicate the direction of dideoxy-sequencing either in DNA subcloned as double-stranded DNA in Blue Scribe M13+ or as single stranded DNA in the mp18 M13 vector. The entire sequence was determined using synthetic oligonucleotides represented as gray boxes in the figure.

FIG. 2 represents the homology between known nucleotide and amino acid sequences of the antigen 85 and the 85C antigen of M. tuberculosis.

A-Comparison of the DNA sequences of antigen 85A, B, and C:

DNA sequences have been aligned with the "Align" program which visualizes multiple alignments. In this presentation, sequence differences are outlined:

(•) indicates identical residues; (-) indicates a gap; (any letter) indicates a substitution.

All sequences are compared and aligned to the first line (gene 85A).

85A-TUB: DNA sequence from M. tuberculosis (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis", Infect. Immun. 57:3123).

85B-BCG: DNA sequence from α-antigen of Mycobacterium bovis (strain Tokyo) (Matsuo K. et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α-antigen", J. Bacteriol. 170:3847).

85C-TUB: DNA sequence from antigen 85C from Mycobacterium tuberculosis (the present invention).

85B-KAN: DNA sequence from antigen 85B from M. kansasii (Matsuo K. et al., 1990, "Cloning and expression of the gene for cross-reactive α antigen of Mycobacterium kansasii", Infect. Immun. 58:550).

85C-BCG: Partial DNA sequence from Mycobacterium bovis BCG strain 1173P2 (the present invention). This sequence was obtained from a cloned PCR amplified DNA fragment.

indicates the presumed initiation codon for each gene.

(↓) indicates the first phenylalanine residue of the mature protein.

indicates the termination codon of each gene.

P78 and P79 are sense and antisense primers used for PCR amplification.

85A, -B, -C sequences used for the synthesis of specific synthetic oligonucleotide probes are framed.

The indicated restriction sites have been used to prepare the three type-specific probes (see also FIG. 4A).

B- Comparison of the pre-protein sequences of antigen 85A, B and C:

DNA sequences have been aligned with the "Align" program which visualizes multiple alignments. In this presentation, sequence differences are outlined:

(•) indicates identical residues; (-) indicates a gap; (any letter) indicates a substitution.

All sequences are compared and aligned to the first line (gene 85A).

85A: Protein sequence from M. tuberculosis (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis", Infect. Immun. 57:3123).

85B: Protein sequence from α-antigen of Mycobacterium bovis (strain Tokyo) (Matsuo K. et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α-antigen", J. Bacteriol. 170:3847).

85C: Protein sequence from antigen 85C from Mycobacterium tuberculosis (the present invention).

85B-KAN: Partial protein sequence from antigen 85B from M. kansasii (Matsuo K. et al., 1990, "Cloning and expression of the gene for cross-reactive α antigen of Mycobacterium kansasii", Infect. Immun. 58:550).

85C-BCG: Partial protein sequence from Mycobacterium bovis BCG strain 1173P2 (the present invention).

The "C" characteristic motif is framed.

FIG. 3 represents the hydropathy pattern of the M. tuberculosis 32-kDa (antigen 85A), the α-antigen of BCG (antigen 85B) and antigen 85C from M. tuberculosis, amino acid sequences:

The sequences of the three pre-proteins (including the presumed signal peptides) have been analyzed using the Kyte and Doolittle method (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis", Infect. Immun. 57:3123) with a window of eight amino acids. Each bar on the axes represents 50 amino acids. Since the lengths of the signal sequences are slightly different (43, 40 and 46 residues for the three proteins 85A, 85B, 85C) the patterns are aligned to the first residue of the three mature proteins. Plain lines are used to align hydrophobic peaks and a dashed line to align hydrophilic peaks.

FIG. 4A represents the restriction endonuclease maps of the three genes 85A, 85B and 85C: type-specific probes are marked by <->.

The map of gene 85A is derived from Borremans et al. (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis", Infect. Immun. 57:3123). The map of 85B was obtained from clone 5.1 derived from our Mycobacterium bovis BCG 1173P2 λgt11 recombinant library (De Wit L. et al., 1990, "Nucleotide sequence of the 32 kDa-protein gene (antigen 85A) of Mycobacterium bovis BCG", Nuc. Ac. Res. 18:3995). For the restriction enzymes used, this map is identical to that published for M. bovis BCG (strain Tokyo) (Matsuo K. et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α antigen", J. Bacteriol. 170:3847). The coding region of the 85B antigen is positioned according to Matsuo et al. (Matsuo et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α-antigen", J. Bacteriol. 170:3847).

The map of 85C corresponds to the restriction map of clone 11.2 that was obtained from the M. tuberculosis λgt11 library from R. Young (Young, R. A. et al., 1985, "Dissection of Mycobacterium tuberculosis antigens using recombinant DNA", Proc. Natl. Acad. Sci. USA 82:2583) (Materials and Methods). The position of the specific 5' DNA restriction fragment used for Southern analysis is indicated in each map by a double arrow.

FIG. 4B represents the Southern analysis of the total genomic DNA from Mycobacterium bovis BCG (strain 1173P2).

Fifteen μg of digested DNA was applied per lane. Hybridization was with oligonucleotide probes A, B, C (as described in FIG. 2A) under the conditions described in Materials and Methods. Molecular weights of the hybridizing bands were calculated by comparison with standards.

FIG. 4C represents the Southern analysis of total genomic DNA from M. bovis BCG 1173P2. The procedure described for FIG. 4B was used.

The three probes, however, were large DNA restriction fragments (as defined in FIG. 4A).

Parts 85A and 85B were obtained from a single filter, whereas 85C was from a separate run.

FIG. 5 represents the pulsed-field electrophoresis of Mycobacterium tuberculosis DNA.

DNA from three strains of Mycobacterium tuberculosis was digested with DraI and separated by pulsed-field electrophoresis on an agarose gel together with a bacteriophage λ DNA "ladder" as described in Materials and Methods. After transfer to nylon filters, hybridization with the three probes 85A, 85B, 85C was as described under FIG. 4A. Molecular weights of the hybridizing bands were calculated by comparison with those of the λ DNA "ladder".

The nucleic acids of the invention

contain a nucleotide sequence extending from the extremity constituted by the nucleotide at position (1) to the extremity constituted by the nucleotide at position (149) represented on FIG. 1 (SEQ ID NO:2),

or contain one at least of the nucleotide sequences coding for the following peptides or polypeptides:

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (-1) represented on FIG. 1 (SEQ ID NO:3), or

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (-1) represented on FIG. 1 (SEQ ID NO:3), or

SQSNGQNY (SEQ ID NO:4), or

PMVQIPRLVA (SEQ ID NO:5), or

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6), or

PPAAPAAPAA (SEQ ID NO:7),

or contain nucleotidic sequences:

hybridizing with the above-mentioned nucleotide sequences, or their complements,

complementary to the above-mentioned nucleotide sequences, or

which are the above-mentioned nucleotide sequences wherein T can be replaced by U,

or are constituted by the above-mentioned nucleotide sequences.

SQSNGQNY (SEQ ID NO:4) is a sequence corresponding to the one extending from position 84 to position 91 of 85C sequence represented on FIG. 1.

PMVQIPRLVA (SEQ ID NO:5) is a sequence corresponding to the one extending from position 191 to position 200 of 85C sequence represented on FIG. 1.

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6) is a sequence corresponding to the one extending from position 229 to position 250 of 85C sequence represented on FIG. 1.

PPAAPAAPAA (SEQ ID NO:7) is a sequence corresponding to the one extending from position 285 to position 294 of 85C sequence represented on FIG. 1.

The hybridization takes place under the following conditions:

hybridization and wash medium:

a preferred hybridization medium contains about 3×SSC [SSC = 0.15M sodium chloride, 0.015M sodium citrate, pH 7], about 25 mM of phosphate buffer pH 7.1, and 20% deionized formamide, 0.02% Ficoll, 0.02% BSA, 0.02% polyvinylpyrrolidone and about 0.1 mg/ml sheared denatured salmon sperm DNA,

a preferred wash medium contains about 3×SSC, about 25 mM phosphate buffer, pH 7.1 and 20% deionized formamide;

hybridization temperature (HT) and wash temperature (WT) are between 45° C. and 65° C.;

for the nucleotide sequence extending from the extremity constituted by the nucleotide at position (1) to the extremity constituted by the nucleotide at position (149) represented on FIG. 1: HT=WT=65° C.

for the nucleic acids of the invention defined by coded polypeptides X-Y, i.e. by the coded sequence extending from the extremity constituted by the amino acid at position (X) to the extremity constituted by the amino acid at position (Y) represented on FIG. 1:

(-46)-(-1) HT=WT=65° C.

(-21)-(-1) HT=WT=60° C.

for the nucleic acids defined by coded polypeptides represented by their sequence:

SQSNGQNY (SEQ ID NO:4) HT=WT=45° C.

PMVQIPRLVA (SEQ ID NO:5) HT=WT=55° C.

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6) HT=WT=65° C.

PPAAPAAPAA (SEQ ID NO:7) HT=WT=65° C.

The above-mentioned temperatures are approximate and are to be understood as within ±5° C.
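As a worked illustration of the medium composition given above, the following sketch computes the salt molarities implied by 3×SSC and the stock volumes needed for a chosen final volume, using C1·V1 = C2·V2. The stock concentrations (20×SSC, 1 M phosphate buffer, 100% formamide, 10 mg/ml salmon sperm DNA) and all function names are assumptions made for illustration; the description itself does not specify them. Ficoll, BSA and polyvinylpyrrolidone are left in the remainder rather than computed.

```python
# 1x SSC as defined in the text: 0.15 M sodium chloride, 0.015 M sodium citrate.
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_molarities(fold):
    """Return (NaCl, sodium citrate) molarities for an n-fold SSC solution."""
    return fold * SSC_1X_NACL_M, fold * SSC_1X_CITRATE_M


def stock_volume(final_conc, stock_conc, final_volume_ml):
    """Stock volume in ml from C1*V1 = C2*V2 (both concentrations in one unit)."""
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the target concentration")
    return final_conc * final_volume_ml / stock_conc


def hybridization_mix(final_volume_ml):
    """Stock volumes (ml) for the preferred medium. Stock strengths are ASSUMED."""
    volumes = {
        "20x SSC": stock_volume(3, 20, final_volume_ml),                  # fold
        "1 M phosphate buffer pH 7.1": stock_volume(25, 1000, final_volume_ml),  # mM
        "100% deionized formamide": stock_volume(20, 100, final_volume_ml),      # % v/v
        "10 mg/ml salmon sperm DNA": stock_volume(0.1, 10, final_volume_ml),     # mg/ml
    }
    # Water plus the 0.02% Ficoll, BSA and polyvinylpyrrolidone make up the rest.
    volumes["water and remaining components"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    print(ssc_molarities(3))
    for name, ml in hybridization_mix(100).items():
        print(f"{name}: {ml:.2f} ml")
```

Under these assumed stocks, 3×SSC corresponds to about 0.45 M sodium chloride and 0.045 M sodium citrate.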

Advantageous nucleic acids of the invention contain at least one of the following nucleotide sequences:

the one extending from the extremity constituted by the nucleotide at position (150) to the extremity constituted by the nucleotide at position (287) on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (224) to the extremity constituted by the nucleotide at position (287) on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (537) to the extremity constituted by the nucleotide at position (560) on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (858) to the extremity constituted by the nucleotide at position (887) on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (972) to the extremity constituted by the nucleotide at position (1037) on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (1140) to the extremity constituted by the nucleotide at position (1169) on FIG. 1,

or contain nucleotidic sequences:

hybridizing with the above-mentioned nucleotide sequences, or

complementary to the above-mentioned nucleotide sequences, or

which are the above-mentioned nucleotide sequences wherein T can be replaced by U,

or are constituted by the above-mentioned nucleotide sequences.

The hybridization takes place under the following conditions:

hybridization and wash medium are as defined above;

hybridization temperature (HT) and wash temperature (WT) for the nucleic acids of the invention defined by X-Y: i.e. by the sequence extending from the extremity constituted by the nucleotide at position (X) to the extremity constituted by the nucleotide at position (Y) represented on FIG. 1:

(150)-(287) HT=WT=65° C.

(224)-(287) HT=WT=60° C.

(537)-(560) HT=WT=45° C.

(858)-(887) HT=WT=55° C.

(972)-(1037) HT=WT=65° C.

(1140)-(1169) HT=WT=65° C.
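The nucleotide ranges listed above can be related to the amino acid positions quoted earlier for the four peptides. The sketch below is a minimal illustration, assuming that the codon for mature residue 1 begins at nucleotide 288 (the position following the (150)-(287) signal-coding segment) and that the numbering has no residue 0; the constant and function names are hypothetical. Under this assumption residue (-21) maps to nucleotide 225, one position away from the (224) cited in the text, so the sketch is illustrative only and FIG. 1 remains authoritative.

```python
MATURE_START_NT = 288  # ASSUMED: first nucleotide of the codon for mature residue 1


def residue_to_codon_start(pos):
    """First nucleotide of the codon for a residue (signal residues are negative)."""
    if pos == 0:
        raise ValueError("the numbering has no residue 0")
    offset = pos - 1 if pos > 0 else pos
    return MATURE_START_NT + 3 * offset


def peptide_to_nucleotides(first, last):
    """Inclusive nucleotide range coding for residues first..last."""
    return residue_to_codon_start(first), residue_to_codon_start(last) + 2


if __name__ == "__main__":
    peptides = {
        "signal peptide (-46)-(-1)": (-46, -1),
        "SQSNGQNY": (84, 91),
        "PMVQIPRLVA": (191, 200),
        "GLTLRTNQTFRDTYAADGGRNG": (229, 250),
        "PPAAPAAPAA": (285, 294),
    }
    for name, (first, last) in peptides.items():
        print(name, peptide_to_nucleotides(first, last))
```

With the stated assumption, the four peptides and the 46-residue signal peptide map onto the ranges (537)-(560), (858)-(887), (972)-(1037), (1140)-(1169) and (150)-(287) respectively.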

An advantageous group of nucleic acids of the invention contains the nucleotide sequence coding for the following peptide:

SQSNGQNY (SEQ ID NO:4)

and possibly containing the nucleotide sequence coding for the following peptide:

FSRPGLPVEYLQVP (SEQ ID NO:8)

and liable to hybridize with the following nucleotide sequence:

CGGCTGGGAC(or T)ATCAACACCCCGGC (SEQ ID NO:9)

and liable to hybridize neither with

GCCTGCGGCAAGGCCGGTTGCCAG (SEQ ID NO:10)

nor with

GCCTGCGGTAAGGCTGGCTGCCAG (SEQ ID NO:11)

nor with

GCCTGCGGCAAGGCCGGCTGCACG (SEQ ID NO:12)

or are constituted by the above-mentioned hybridizing nucleotide sequences.

The above-mentioned hybridization can take place when the hybridization and wash medium is as indicated above; and the hybridization and wash temperature is 52° C.

The expression "not liable to hybridize with" means that the nucleic acid molecule of the invention does not contain a stretch of nucleotides hybridizing at 52° C. in the above defined medium with the three probes defined above.
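The notation "C(or T)" in SEQ ID NO:9 denotes two alternative oligonucleotides. The following sketch, offered only as an illustration, expands that notation into the concrete sequences and computes their reverse complements, which is relevant because hybridization may involve either strand. The function names and the parsing of the "(or X)" notation are assumptions of this example, not part of the description.

```python
import re
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand_probe(notation):
    """Expand 'X(or Y)' alternatives into every concrete sequence."""
    # Each token is either a base followed by '(or Y)' or a plain base.
    tokens = re.findall(r"([ACGT])\(or ([ACGT])\)|([ACGT])", notation)
    choices = [(a, b) if a else (c,) for a, b, c in tokens]
    return ["".join(bases) for bases in product(*choices)]


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for seq in expand_probe("CGGCTGGGAC(or T)ATCAACACCCCGGC"):
        print(seq, reverse_complement(seq))
```

Both expanded variants are 24 nucleotides long and differ only at the tenth position.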

Advantageous nucleic acids of the invention contain one at least of the above-mentioned nucleotide sequences or are constituted by the above-mentioned nucleotide sequences and besides contain an open reading frame coding for a polypeptide:

liable to react selectively with human sera from tuberculosis patients and particularly patients developing an evolutive tuberculosis,

or liable to be recognized by antibodies also recognizing the amino acid sequence extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

or liable to generate antibodies recognizing the amino acid sequence extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1.

The recognition of the above-mentioned sequence of the 294 amino acids (or of the polypeptides of the invention) by the above said antibodies means that the above-said sequence forms a complex with one of the above-mentioned antibodies.

Forming a complex between the antigen (i.e. the sequence of 294 amino acids or any polypeptide of the invention) and the antibodies and detecting the existence of a formed complex can be done according to classical techniques (such as the one using a tracer labeled with radioactive isotopes or an enzyme).

Hereafter is given, in a non-limitative way, a process for testing the selective reaction between the antigen and human sera from tuberculosis patients and particularly patients developing an evolutive tuberculosis.

This test is an immunoblotting (Western blotting) analysis, in the case where the polypeptides of the invention are obtained by recombinant techniques. This test can also be used for polypeptides of the invention obtained by a different preparation process. After sodium dodecyl sulfate-polyacrylamide gel electrophoresis, polypeptides of the invention are blotted onto nitrocellulose membranes (Hybond C (Amersham)) as described by Towbin H. et al., 1979, "Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: procedure and some applications" Proc. Natl. Acad. Sci. USA 76:4350-4354. The expression of polypeptides of the invention fused to β-galactosidase in E. coli Y1089 is visualized by the binding of a polyclonal rabbit anti-antigen 85 serum (1:1,000) or by using a monoclonal anti-β-galactosidase antibody (Promega). The secondary antibody (alkaline phosphatase anti-rabbit immunoglobulin G and anti-mouse alkaline phosphatase immunoglobulin G conjugates, respectively) is diluted as recommended by the supplier (Promega).

In order to identify selective recognition of polypeptides of the invention and of fusion proteins of the invention by human tuberculous sera, nitrocellulose sheets are incubated overnight with these sera (1:50) (after blocking nonspecific protein-binding sites). Reactive areas on the nitrocellulose sheets are revealed by incubation with peroxidase-conjugated goat anti-human immunoglobulin G antibody (Dakopatts, Copenhagen, Denmark) (1:200) for 4 h. After repeated washings, color reaction is developed by adding peroxidase substrate (α-chloronaphthol) (Bio-Rad Laboratories, Richmond, Calif.) in the presence of peroxidase and hydrogen peroxide.

Advantageous nucleic acids of the invention contain or are constituted by one of the above-mentioned nucleotide sequences, contain an open reading frame and code for a mature polypeptide of about 30 to about 35 kD, and contain a sequence coding for a signal sequence.

Advantageous nucleic acids of the invention contain one at least of the nucleotide sequences coding for the following polypeptides:

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (-1) represented on FIG. 1, or

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (-1) represented on FIG. 1, or

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (294) represented on FIG. 1, or

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (294) represented on FIG. 1, or

the one extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

or contain nucleotidic sequences:

hybridizing with the above-mentioned nucleotide sequences, or

complementary to the above-mentioned nucleotide sequences, or

which are the above-mentioned nucleotide sequences wherein T can be replaced by U,

or are constituted by the above-mentioned nucleotide sequences.

The hybridization takes place under the following conditions:

hybridization and wash medium are as above defined;

hybridization temperature (HT) and wash temperature (WT) for the nucleic acids of the invention defined by coded polypeptides X-Y: i.e. by the coded sequence extending from the extremity constituted by the amino acid at position (X) to the extremity constituted by the amino acid at position (Y) represented on FIG. 1:

(-46)-(-1) HT=WT=65° C.

(-21)-(-1) HT=WT=60° C.

(-46)-(294) HT=WT=70° C.

(-21)-(294) HT=WT=70° C.

(1)-(294) HT=WT=70° C.

Advantageous nucleic acids of the invention contain one at least of the following nucleotide sequences:

the one extending from the extremity constituted by the nucleotide at position (150) to the extremity constituted by the nucleotide at position (287) represented on FIG. 1, or

the one extending from the extremity constituted by the nucleotide at position (224) to the extremity constituted by the nucleotide at position (287) represented on FIG. 1, or

the one extending from the extremity constituted by the nucleotide at position (1) to the extremity constituted by the nucleotide at position (1169) represented on FIG. 1, or

the one extending from the extremity constituted by the nucleotide at position (150) to the extremity constituted by the nucleotide at position (1169) represented on FIG. 1, or

the one extending from the extremity constituted by the nucleotide at position (224) to the extremity constituted by the nucleotide at position (1169) represented on FIG. 1, or

the one extending from the extremity constituted by the nucleotide at position (288) to the extremity constituted by the nucleotide at position (1169) represented on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (1) to the extremity constituted by the nucleotide at position (1211) represented on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (150) to the extremity constituted by the nucleotide at position (1211) represented on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (224) to the extremity constituted by the nucleotide at position (1211) represented on FIG. 1,

the one extending from the extremity constituted by the nucleotide at position (288) to the extremity constituted by the nucleotide at position (1211) represented on FIG. 1,

or contain nucleotidic sequences:

hybridizing with the above-mentioned nucleotide sequences, or

complementary to the above-mentioned nucleotide sequences, or

which are the above-mentioned nucleotide sequences wherein T can be replaced by U,

or are constituted by one at least of the above-mentioned nucleotide sequences.

The hybridization takes place under the following conditions:

hybridization and wash medium are as above defined;

hybridization temperature (HT) and wash temperature (WT) for the nucleic acids of the invention defined by X-Y: i.e. by the sequence extending from the extremity constituted by the nucleotide at position (X) to the extremity constituted by the nucleotide at position (Y) represented on FIG. 1:

(150)-(287) HT=WT=65° C.

(224)-(287) HT=WT=60° C.

(150)-(1169) HT=WT=70° C.

(1)-(1169) HT=WT=70° C.

(224)-(1169) HT=WT=70° C.

(288)-(1169) HT=WT=70° C.
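The temperature list above can be read as a lookup table keyed by nucleotide range. The sketch below records it as such and derives each segment's length and an acceptable temperature window. It assumes that the ±5° C. tolerance stated earlier in this description for the first temperature list also applies to this list; that carry-over, and all names used, are assumptions of the example.

```python
TOLERANCE_C = 5  # ASSUMED to apply here; stated in the text for an earlier list

# (first nucleotide, last nucleotide) -> HT = WT in degrees Celsius, as listed above
HYBRIDIZATION_TEMPS_C = {
    (150, 287): 65,
    (224, 287): 60,
    (150, 1169): 70,
    (1, 1169): 70,
    (224, 1169): 70,
    (288, 1169): 70,
}


def segment_length(first, last):
    """Number of nucleotides in the inclusive range first..last."""
    return last - first + 1


def temperature_window(first, last):
    """Lowest and highest temperature within the assumed tolerance."""
    nominal = HYBRIDIZATION_TEMPS_C[(first, last)]
    return nominal - TOLERANCE_C, nominal + TOLERANCE_C


def is_acceptable(first, last, temp_c):
    """True if temp_c lies inside the window for the given segment."""
    low, high = temperature_window(first, last)
    return low <= temp_c <= high


if __name__ == "__main__":
    for (first, last), temp in HYBRIDIZATION_TEMPS_C.items():
        print(f"({first})-({last}): {segment_length(first, last)} nt, "
              f"{temp} C, window {temperature_window(first, last)}")
```

A range that is not in the table raises a `KeyError`, which is deliberate: the description gives no temperature for ranges it does not list.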

The invention relates also to the polypeptides coded by the nucleicacids of the invention above defined.

Advantageous polypeptides of the invention contain at least one of thefollowing amino acid sequences in their polypeptide chain:

the one extending from the extremity constituted by amino acid atposition (-46) to the extremity constituted by amino acid at position(-1) represented on FIG. 1,

or the one extending from the extremity constituted by amino acid atposition (-21) to the extremity constituted by amino acid at position(-1) represented on FIG. 1, or

SQSNGQNY (SEQ ID NO:4), or

PMVQIPRLVA (SEQ ID NO:5), or

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6), or

PPAAPAAPAA (SEQ ID NO:7), or are constituted by the above-mentioned polypeptide sequences.

The invention also relates to polypeptides containing, in their polypeptide chain, the following amino acid sequence:

SQSNGQNY (SEQ ID NO:4)

and possibly the amino acid sequence

GWDINTPA (SEQ ID NO:13)

and possibly the amino acid sequence

FSRPGLPVEYLQVP (SEQ ID NO:8)

and not containing the amino acid sequence

ACGKAGCQ (SEQ ID NO:14)

and not the amino acid sequence

ACGKAGCT (SEQ ID NO:15)

Advantageous polypeptides of the invention contain in their polypeptide chain the following amino acid sequences:

SQSNGQNY (SEQ ID NO:4)

GWDINTPA (SEQ ID NO:13)

FSRPGLPVEYLQVP (SEQ ID NO:8)

and one at least of the following amino acid sequences:

PMVQIPRLVA (SEQ ID NO:5),

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6),

PPAAPAAPAA (SEQ ID NO:7),

and not containing the amino acid sequence

ACGKAGCQ (SEQ ID NO:14)

and not the amino acid sequence

ACGKAGCT (SEQ ID NO:15).
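The inclusion and exclusion rules just stated can be expressed as a simple substring test. The sketch below is an illustration under stated assumptions: the function name `is_advantageous` and the constant names are hypothetical, and SEQ ID NO:6 is spelled GLTLRTNQTFRDTYAADGGRNG, as it appears where the sequence is first listed in this description.

```python
# Motifs quoted in the description (one-letter code). Constant and function
# names are illustrative assumptions, not part of the specification.
REQUIRED = (
    "SQSNGQNY",        # SEQ ID NO:4
    "GWDINTPA",        # SEQ ID NO:13
    "FSRPGLPVEYLQVP",  # SEQ ID NO:8
)
AT_LEAST_ONE = (
    "PMVQIPRLVA",              # SEQ ID NO:5
    "GLTLRTNQTFRDTYAADGGRNG",  # SEQ ID NO:6
    "PPAAPAAPAA",              # SEQ ID NO:7
)
EXCLUDED = (
    "ACGKAGCQ",  # SEQ ID NO:14
    "ACGKAGCT",  # SEQ ID NO:15
)


def is_advantageous(chain):
    """Check a polypeptide chain against the motif rules stated above."""
    chain = chain.upper()
    has_required = all(motif in chain for motif in REQUIRED)
    has_specific = any(motif in chain for motif in AT_LEAST_ONE)
    has_excluded = any(motif in chain for motif in EXCLUDED)
    return has_required and has_specific and not has_excluded


if __name__ == "__main__":
    # Artificial chain assembled from the motifs only, to exercise the check.
    demo = "SQSNGQNY" + "GWDINTPA" + "FSRPGLPVEYLQVP" + "PPAAPAAPAA"
    print(is_advantageous(demo))
    print(is_advantageous(demo + "ACGKAGCQ"))
```

The test is purely textual: it says nothing about immunological reactivity, which the description treats as a separate property.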

The following polypeptides are new:

SQSNGQNY (SEQ ID NO:4),

PMVQIPRLVA (SEQ ID NO:5),

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6),

PPAAPAAPAA (SEQ ID NO:7).

Advantageous polypeptides of the invention are liable to react selectively with human sera from tuberculosis patients and particularly patients developing an evolutive tuberculosis, or liable to be recognized by antibodies also recognizing the polypeptide sequence extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1, or liable to generate antibodies recognizing the polypeptidic sequence extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1.

The invention also includes the peptidic sequences resulting from the modification by substitution and/or by addition and/or by deletion of one or several amino acids in the above defined polypeptides and peptides in so far as this modification does not alter the following properties:

selective reaction with human sera from tuberculosis patients and particularly patients developing an evolutive tuberculosis,

and/or reaction with antibodies raised against the amino acid sequence extending from the extremity constituted by amino acid at position (1), to the extremity constituted by amino acid at position (294) represented on FIG. 1.

Advantageous polypeptides of the invention contain or are constituted by one of the above-mentioned polypeptide sequences, and are about 30 to about 35 kD and are preceded by a signal peptide.

Advantageous polypeptides of the invention contain in their polypeptide chain, one at least of the following amino acid sequences or are constituted by one of the following amino acid sequences:

the one extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (-1) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (-1) represented on FIG. 1.

It goes without saying that the free reactive functions which are present in some of the amino acids, which are part of the constitution of the polypeptides of the invention, particularly the free carboxyl groups which are carried by the groups Glu or Asp or by the C-terminal amino acid on the one hand and/or the free NH₂ groups carried by the N-terminal amino acid or by amino acid inside the peptidic chain, for instance Lys, on the other hand, can be modified insofar as this modification does not alter the above-mentioned properties of the polypeptide.

The molecules which are thus modified are naturally part of the invention. The above-mentioned carboxyl groups can be acylated or esterified.

Other modifications are also part of the invention. Particularly, the amine or ester functions or both of terminal amino acids can be themselves involved in the bond with other amino acids. For instance, the N-terminal amino acid can be linked to a sequence comprising from 1 to several amino acids corresponding to a part of the C-terminal region of another peptide.

The polypeptides according to the invention can be glycosylated or not, particularly in some of their glycosylation sites of the type Asn-X-Ser or Asn-X-Thr, X representing any amino acid.
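Sites of the type Asn-X-Ser or Asn-X-Thr can be located in a one-letter sequence with a short pattern search. This is a minimal sketch: the function name `glycosylation_sites` is an assumption, X is taken to be any amino acid exactly as stated above, and the two sample inputs are peptides of Table 1 of this description, numbered from their first residue.

```python
import re

# N, then any residue, then S or T. The lookahead lets overlapping sites
# (for example N-N-S-S) be reported individually.
SITE_PATTERN = re.compile(r"(?=N[A-Z][ST])")


def glycosylation_sites(sequence, first_position=1):
    """Return the positions of the Asn residue of each Asn-X-Ser/Thr site.

    first_position is the numbering of the first residue of `sequence`.
    """
    return [first_position + m.start()
            for m in SITE_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Peptides 204-223 and 250-269 of Table 1, used only as sample input.
    print(glycosylation_sites("RIWVYCGNGTPSDLGGDNIP", first_position=204))
    print(glycosylation_sites("GVFNFPPNGTHSWPYWNEQL", first_position=250))
```

The search only reports candidate sites by sequence; whether a given site is actually glycosylated depends on the cellular host and is not predicted here.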

Other advantageous polypeptides of the invention consist in one of the following amino acid sequences:

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (-1) represented on FIG. 1,

or the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (-1) represented on FIG. 1.

These polypeptides can be used as signal peptides, the role of which is to initiate the translocation of a protein from its site of synthesis to the membrane and which is excised during translocation.

Advantageous polypeptides of the invention are the ones constituted by:

SQSNGQNY (SEQ ID NO:4),

PMVQIPRLVA (SEQ ID NO:5),

GLTLRTNQTFRDTYAADGGRNG (SEQ ID NO:6),

PPAAPAAPAA (SEQ ID NO:7),

the one extending from the extremity constituted by amino acid at position (1) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (294) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-46) to the extremity constituted by amino acid at position (-1) represented on FIG. 1,

the one extending from the extremity constituted by amino acid at position (-21) to the extremity constituted by amino acid at position (-1) represented on FIG. 1.

All these polypeptides are new.

Other interesting polypeptides, which are common to the already known sequences of antigens 85A, 85B and 85C of M. tuberculosis, M. bovis and M. kansasii, are (see FIG. 2A)

GWDINTPA (SEQ ID NO:13),

and

FSRPGLPVEYLQVP (SEQ ID NO:8).

It is to be noted that the above-mentioned polypeptides are derived from the expression products of a DNA derived, as explained hereafter in the examples,

from the nucleotide sequence coding for a protein of 33-kDa secreted by Mycobacterium tuberculosis or

from the partial nucleotide sequence coding for a protein of 33-kDa secreted by M. bovis BCG, or

from related nucleotide sequences which will be hereafter designated by 85C genes.

The invention also relates to the amino acid sequences constituted by the above-mentioned polypeptides and a protein or a heterologous sequence with respect to said polypeptide, said protein or heterologous sequence comprising for instance from about 1 to about 1000 amino acids. These amino acid sequences will be called fusion proteins.

In an advantageous fusion protein of the invention, the heterologous protein is β-galactosidase.

The invention also relates to any recombinant nucleic acids containing at least one of the nucleic acids of the invention inserted in a heterologous nucleic acid.

The invention relates more particularly to recombinant nucleic acid such as defined, in which the nucleotide sequence of the invention is preceded by a promoter (particularly an inducible promoter) under the control of which the transcription of said sequence is liable to be processed and possibly followed by a sequence coding for transcription termination signals.

The invention also relates to the recombinant nucleic acids in which the nucleic acid sequences coding for the polypeptide of the invention and possibly the signal peptide, are recombined with control elements which are heterologous with respect to the ones with which they are normally associated in the mycobacterial genome, more particularly, the regulation elements adapted to control their expression in the cellular host which has been chosen for their production.

The invention also relates to recombinant vectors, particularly for cloning and/or expression, comprising a vector sequence, notably of the type plasmid, cosmid or phage DNA or virus DNA, and a recombinant nucleic acid of the invention, in one of the non-essential sites for its replication.

According to an advantageous embodiment of the invention, the recombinant vector contains, in one of its non-essential sites for its replication, necessary elements to promote the expression of polypeptides according to the invention in a cellular host and notably a promoter recognized by the RNA polymerase of the cellular host, particularly an inducible promoter and possibly a sequence coding for transcription termination signals and possibly a signal sequence and/or an anchor sequence.

According to another additional embodiment of the invention, the recombinant vector contains the elements enabling the expression by E. coli of a nucleic acid according to the invention inserted in the vector, and particularly the elements enabling the expression of the gene or part thereof of β-galactosidase.

The invention also relates to a cellular host which is transformed by a recombinant vector according to the invention, and containing the regulation elements enabling the expression of the nucleotide sequence coding for the polypeptide according to the invention in this host.

The invention also relates to a cellular host chosen from among bacteria such as E. coli, transformed by a vector as defined above, or chosen from among eukaryotic organisms, such as CHO cells or insect cells, transfected by a vector as above defined.

The invention relates to an expression product of a nucleic acid expressed by a transformed cellular host according to the invention.

The invention also relates to the use of any secreted polypeptide of the invention as a carrier antigen for foreign epitopes (epitopes of a polypeptide sequence heterologous with respect to the polypeptides of the invention) in the Mycobacterium bovis BCG vaccine strain.

The Mycobacterium bovis BCG vaccine strain used can be available from Institut Pasteur (Paris), under 1173P₂.

The recombinant DNA comprising the nucleic acid coding for any one of the polypeptides of the invention and the nucleic acid coding for any foreign epitopes as defined above can contain the promoter sequence of said polypeptide of the invention, the signal sequence of said polypeptide, possibly the coding part of said polypeptide and the coding nucleic acid of the foreign epitope, said nucleic acid of the foreign epitope being for instance

either directly located after the signal sequence, and if the coding part of the polypeptide of the invention is present, upstream from the coding part of the polypeptide of the invention,

or located downstream from the coding part of the polypeptide of the invention,

or located within the coding part of the polypeptide of the invention.

The recombinant DNA as above defined can be transformed into the vaccine strain BCG where it leads to the expression and secretion of a recombinant protein antigen.

From the nucleic acids of the invention, probes (i.e. cloned or synthetic oligonucleotides) can be inferred.

These probes can be from 15 to the maximum number of nucleotides of the selected nucleic acids. The oligonucleotides can also be used either as amplification primers in the PCR technique (PCR, Mullis and Faloona, Methods in Enzymology, vol. 155, p. 335, 1987) to generate specific enzymatically amplified fragments and/or as probes to detect fragments amplified between bracketing oligonucleotide primers.

The specificity of a PCR-assisted hybridization assay can be controlled at different levels.

The amplification process or the detection process or both can be specific. The latter case giving the higher specificity is preferred.

The invention also relates to a process for preparing a polypeptide according to the invention comprising the following steps:

the culture in an appropriate medium of a cellular host which has previously been transformed by an appropriate vector containing a nucleic acid according to the invention,

the recovery of the polypeptide produced by the above said transformed cellular host from the above said culture, and

the purification of the polypeptide produced, optionally by means of immobilized metal ion affinity chromatography (IMAC).

The polypeptides of the invention can be prepared according to the classical techniques in the field of peptide synthesis.

The synthesis can be carried out in homogeneous solution or in solid phase.

For instance, the synthesis technique in homogeneous solution which can be used is the one described by Houben-Weyl in the book entitled "Methoden der organischen Chemie" (Methods of organic chemistry) edited by E. Wünsch, vol. 15-I and II, Thieme, Stuttgart 1974.

The polypeptides of the invention can also be prepared in solid phase according to the methods described by Atherton and Sheppard in their book entitled "Solid phase peptide synthesis" (IRL Press, Oxford, New York, Tokyo, 1989).

The invention also relates to a process for preparing the nucleic acids according to the invention.

A suitable method for chemically preparing the single-stranded nucleic acids of the invention (containing at most 100 nucleotides) comprises the following steps:

DNA synthesis using the automatic β-cyanoethyl phosphoramidite method described in Bioorganic Chemistry 4; 274-325, 1986.

In the case of single-stranded DNA, the material which is obtained at the end of the DNA synthesis can be used as such.

A suitable method for chemically preparing the double-stranded nucleic acids of the invention (containing at most 100 bp) comprises the following steps:

DNA synthesis of one sense oligonucleotide using the automatic β-cyanoethyl phosphoramidite method described in Bioorganic Chemistry 4; 274-325, 1986, and DNA synthesis of one anti-sense oligonucleotide using said above-mentioned automatic β-cyanoethyl phosphoramidite method,

combining the sense and anti-sense oligonucleotides by hybridization in order to form a DNA duplex,

cloning the DNA duplex obtained into a suitable plasmid vector and recovery of the DNA according to classical methods, such as restriction enzyme digestion and agarose gel electrophoresis.
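For the duplex preparation described above, the anti-sense oligonucleotide is the reverse complement of the sense oligonucleotide. The following sketch shows one way to derive it; the function name `reverse_complement` is an assumption, and the sample input is the oligonucleotide probe C quoted in the Materials and Methods of this description, used here only as an example of a sense strand.

```python
# Complement table for unambiguous DNA bases (illustrative helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sense):
    """Return the anti-sense strand, written 5' to 3', for a sense strand."""
    sense = sense.upper()
    unknown = set(sense) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense_oligo = "TCGCAGAGCAACGGCCAGAACTAC"  # probe C, sample input only
    antisense_oligo = reverse_complement(sense_oligo)
    print("sense     5'-" + sense_oligo + "-3'")
    print("antisense 5'-" + antisense_oligo + "-3'")
```

Applying the function twice returns the original strand, which is a convenient sanity check before ordering or synthesizing a pair of oligonucleotides.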

A method for the chemical preparation of nucleic acids of length greater than 100 nucleotides (or base pairs, in the case of double-stranded nucleic acids) comprises the following steps:

assembling of chemically synthesized oligonucleotides, provided at their ends with different restriction sites, the sequences of which are compatible with the succession of amino acids in the natural peptide, according to the principle described in Proc. Nat. Acad. Sci. USA 80; 7461-7465, 1983,

cloning the DNA thereby obtained into a suitable plasmid vector and recovery of the desired nucleic acid according to classical methods, such as restriction enzyme digestion and agarose gel electrophoresis.

The invention also relates to antibodies themselves formed against the polypeptides according to the invention.

It goes without saying that this production is not limited to polyclonal antibodies.

It also relates to any monoclonal antibody produced by any hybridoma liable to be formed according to classical methods from splenic cells of an animal, particularly of a mouse or rat, immunized against the purified polypeptide of the invention on the one hand, and of cells of a myeloma cell line on the other hand, and to be selected by its ability to produce the monoclonal antibodies recognizing the polypeptide which has been initially used for the immunization of the animals.

The invention also relates to any antibody of the invention labeled by an appropriate label of the enzymatic, fluorescent or radioactive type.

The peptides which are advantageously used to produce antibodies, particularly monoclonal antibodies, are the following ones listed in Table 1 (referring to FIG. 1):

TABLE 1
__________________________________________________________________________
First residue   Peptide                             Last residue   Identifier
__________________________________________________________________________
 38             H₂N-DGLRAQDDYNGWDINTPAFE-COOH        57            (SEQ ID NO:16)
 78             H₂N-TDWYQPSQSNGQNYTYKWET-COOH        97            (SEQ ID NO:17)
174             H₂N-ANSMWGPSSDPAWKRNDPMV-COOH       193            (SEQ ID NO:18)
204             H₂N-RIWVYCGNGTPSDLGGDNIP-COOH       223            (SEQ ID NO:19)
235             H₂N-NQTFRDTYAADGGRNGVFNF-COOH       254            (SEQ ID NO:20)
250             H₂N-GVFNFPPNGTHSWPYWNEQL-COOH       269            (SEQ ID NO:21)
275             H₂N-DIQHVLNGATPPAAPAAPAA-COOH       294            (SEQ ID NO:22)
__________________________________________________________________________

The amino acid sequences are given in the one-letter code.
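The positions given in Table 1 can be checked against the peptide lengths, and the short motifs quoted earlier in the description can be placed on the FIG. 1 numbering. The sketch below is illustrative only; the names `TABLE_1`, `lengths_consistent` and `locate` are assumptions.

```python
# (first residue, one-letter sequence, last residue) as listed in Table 1.
TABLE_1 = [
    (38, "DGLRAQDDYNGWDINTPAFE", 57),
    (78, "TDWYQPSQSNGQNYTYKWET", 97),
    (174, "ANSMWGPSSDPAWKRNDPMV", 193),
    (204, "RIWVYCGNGTPSDLGGDNIP", 223),
    (235, "NQTFRDTYAADGGRNGVFNF", 254),
    (250, "GVFNFPPNGTHSWPYWNEQL", 269),
    (275, "DIQHVLNGATPPAAPAAPAA", 294),
]


def lengths_consistent(rows):
    """True if every peptide length equals last - first + 1."""
    return all(len(seq) == last - first + 1 for first, seq, last in rows)


def locate(motif, rows):
    """Return (first, last) positions of motif in the first peptide containing it."""
    for first, seq, _last in rows:
        index = seq.find(motif)
        if index >= 0:
            return first + index, first + index + len(motif) - 1
    return None


if __name__ == "__main__":
    print(lengths_consistent(TABLE_1))
    for motif in ("GWDINTPA", "SQSNGQNY", "PPAAPAAPAA"):
        print(motif, locate(motif, TABLE_1))
```

The same data also confirm that peptides 235-254 and 250-269 overlap by five residues (GVFNF), as their position numbers imply.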

Variations of the peptides listed in Table 1 are also possible depending on their intended use. For example, if the peptides are to be used to raise antisera, the peptides may be synthesized with an extra cysteine residue added. This extra cysteine residue is preferably added to the amino terminus and facilitates the coupling of the peptide to a carrier protein which is necessary to render the small peptide immunogenic. If the peptide is to be labeled for use in radioimmunoassays, it may be advantageous to synthesize the protein with a tyrosine attached to either the amino or carboxyl terminus to facilitate iodination. These peptides therefore possess the primary sequence of the peptides listed in Table 1 but with additional amino acids which do not appear in the primary sequence of the protein and whose sole function is to confer the desired chemical properties to the peptides.

The invention also relates to any polypeptide according to the invention labeled by an appropriate label of the enzymatic, fluorescent or radioactive type.

The invention also relates to a process for detecting in vitro antibodies related to tuberculosis in a human biological sample liable to contain them, this process comprising

contacting the biological sample with a polypeptide or a peptide according to the invention under conditions enabling an in vitro immunological reaction between said polypeptide and the antibodies which are possibly present in the biological sample and

the in vitro detection of the antigen/antibody complex which may be formed.

Preferably, the biological medium is constituted by a human serum.

The detection can be carried out according to any classical process.

By way of example, a preferred method brings into play an immunoenzymatic process according to an ELISA, immunofluorescent, or radioimmunological (RIA) technique, or the equivalent ones.

Such a method for detecting in vitro antibodies related to tuberculosis comprises for instance the following steps:

deposit of determined amounts of a polypeptidic composition according to the invention in the wells of a titration microplate,

introduction into said wells of increasing dilutions of the serum to be diagnosed,

incubation of the microplate,

repeated rinsing of the microplate,

introduction into the wells of the microplate of labeled antibodies against the blood immunoglobulins,

the labeling of these antibodies being based on the activity of an enzyme which is selected from among the ones which are able to hydrolyze a substrate by modifying the absorption of the radiation of this latter at least at a given wavelength,

detection by comparison with a control standard of the amount of hydrolyzed substrate.
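The final comparison step above can be illustrated with a short calculation over a dilution series. This is a sketch under loud assumptions: the description does not specify how the comparison with the control standard is made, so the rule used here (a well is positive when its absorbance exceeds that of the control) is an assumption, the function name `endpoint_titer` is hypothetical, and the absorbance values are invented solely to exercise the code.

```python
def endpoint_titer(readings, control_absorbance):
    """Return the highest serum dilution whose absorbance exceeds the control.

    readings maps a dilution factor (100 means a 1:100 dilution) to the
    absorbance measured in that well. Returns None if no well is positive.
    The "greater than control" rule is an assumption, not taken from the text.
    """
    positive = [dilution for dilution, absorbance in readings.items()
                if absorbance > control_absorbance]
    return max(positive) if positive else None


if __name__ == "__main__":
    # Hypothetical absorbance values, not experimental data.
    example_readings = {100: 1.80, 200: 1.20, 400: 0.65, 800: 0.30, 1600: 0.12}
    print(endpoint_titer(example_readings, control_absorbance=0.25))
```

In practice the cut-off is usually derived from several negative control wells; any such rule can be substituted for the simple comparison without changing the structure of the function.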

The invention also relates to a process for detecting and identifying in vitro antigens of M. tuberculosis in a human biological sample liable to contain them, this process comprising:

contacting the biological sample with an appropriate antibody of the invention under conditions enabling an in vitro immunological reaction between said antibody and the antigens of M. tuberculosis which are possibly present in the biological sample and the in vitro detection of the antigen/antibody complex which may be formed.

Preferably, the biological medium is constituted by sputum, pleural effusion liquid, broncho-alveolar washing liquid, urine, biopsy or autopsy material.

The invention also relates to an additional method for the in vitro diagnosis of tuberculosis in a patient liable to be infected by Mycobacterium tuberculosis comprising the following steps:

the possible previous amplification of the amount of the nucleotide sequences according to the invention, liable to be contained in a biological sample taken from said patient by means of a DNA primer set as defined above,

contacting the above-mentioned biological sample with a nucleotide probe of the invention, under conditions enabling the production of a hybridization complex formed between said probe and said nucleotide sequence,

detecting the above said hybridization complex which has possibly been formed.

To carry out the in vitro diagnostic method for tuberculosis in a patient liable to be infected by Mycobacterium tuberculosis as defined above, the following necessary or kit can be used, with said necessary or kit comprising:

a determined amount of a nucleotide probe of the invention,

advantageously the appropriate medium for creating a hybridization reaction between the sequence to be detected and the above mentioned probe,

advantageously, reagents enabling the detection of the hybridization complex which has been formed between the nucleotide sequence and the probe during the hybridization reaction.

The invention also relates to an additional method for the in vitro diagnosis of tuberculosis in a patient liable to be infected by Mycobacterium tuberculosis comprising:

contacting a biological sample taken from a patient with a polypeptide or a peptide of the invention, under conditions enabling an in vitro immunological reaction between said polypeptide or peptide and the antibodies which are possibly present in the biological sample and

the in vitro detection of the antigen/antibody complex which has possibly been formed.

To carry out the in vitro diagnostic method for tuberculosis in a patient liable to be infected by Mycobacterium tuberculosis, the following necessary or kit can be used, with said necessary or kit comprising:

a polypeptide or a peptide according to the invention,

reagents for making a medium appropriate for the immunological reaction to occur,

reagents enabling the detection of the antigen/antibody complex which has been produced by the immunological reaction, with said reagents possibly having a label, or being liable to be recognized by a labeled reagent, more particularly in the case where the above-mentioned polypeptide or peptide is not labeled.

The invention also relates to an additional method for the in vitro diagnosis of tuberculosis in a patient liable to be infected by M. tuberculosis, comprising the following steps:

contacting the biological sample with an appropriate antibody of the invention under conditions enabling an in vitro immunological reaction between said antibody and the antigens of M. tuberculosis which are possibly present in the biological sample and the in vitro detection of the antigen/antibody complex which may be formed.

To carry out the in vitro diagnostic method for tuberculosis in a patient liable to be infected by Mycobacterium tuberculosis, the following necessary or kit can be used, with said necessary or kit comprising:

an antibody of the invention,

reagents for making a medium appropriate for the immunological reaction to occur,

reagents enabling the detection of the antigen/antibody complexes which have been produced by the immunological reaction, with said reagents possibly having a label or being liable to be recognized by a labeled reagent, more particularly in the case where the above-mentioned antibody is not labeled.

An advantageous kit for the in vitro diagnosis of tuberculosis comprises:

at least a suitable solid phase system, e.g. a microtiter-plate for deposition thereon of the biological sample to be diagnosed in vitro,

a preparation containing one of the monoclonal antibodies of the invention,

a specific detection system for said monoclonal antibody,

appropriate buffer solutions for carrying out the immunological reaction between the biological sample and said monoclonal antibody on the one hand, and the bonded monoclonal antibodies and the detection system on the other hand.

The invention also relates to a kit, as described above, also containing a preparation of one of the polypeptides or peptides of the invention, with said antigen of the invention being either a standard (for quantitative determination of the antigen of M. tuberculosis which is sought) or a competitor, with respect to the antigen which is sought, for the kit to be used in a competition dosage process.

The invention also relates to a necessary or kit for the diagnosis of prior exposure of a subject to M. tuberculosis, with said necessary or kit containing a preparation of at least one of the polypeptides or peptides of the invention, with said preparation being able to induce in vivo, after being intradermally injected to a subject, a delayed-type hypersensitivity reaction at the site of injection, in case the subject has had prior exposure to M. tuberculosis.

This necessary or kit is called a skin test.

The invention also relates to an immunogenic composition comprising a polypeptide or a peptide according to the invention, in association with a pharmaceutically acceptable vehicle.

The invention also relates to a vaccine composition comprising among other immunogenic principles any one of the polypeptides or peptides of the invention or the expression product of the invention, possibly coupled to a natural protein or to a synthetic polypeptide having a sufficient molecular weight so that the conjugate is able to induce in vivo the production of antibodies neutralizing Mycobacterium tuberculosis, or induce in vivo a cellular immune response by activating M. tuberculosis antigen-responsive T cells.

The peptides of the invention which are advantageously used as immunogenic principle are the ones mentioned in Table 1.

Other characteristics and advantages of the invention will appear in the following examples and the figures illustrating the invention.

MATERIALS AND METHODS

1. Preparation of genomic DNA (Thole J. et al., 1985, "Cloning of Mycobacterium bovis BCG DNA and expression of antigens in Escherichia coli" Infect. Immun. 50:3800):

M. bovis BCG was cultivated at 37° C. in Sauton medium and harvested after an additional incubation of 18 h in the presence of 1% glycine added at the end of the late exponential growth phase. The bacteria were treated with lysozyme and proteinase K, lysed with sodium dodecyl sulfate, phenol extracted and ethanol precipitated.

2. Genomic libraries:

A λgt11 recombinant library constructed from genomic DNA of M. tuberculosis (Erdman strain), was obtained from Young R. A. et al., 1985, "Dissection of Mycobacterium tuberculosis antigens using recombinant DNA" Proc. Natl. Acad. Sci. USA 82:2583.

A second λgt11 recombinant library was prepared with genomic DNA from M. bovis BCG (De Wit L. et al., 1990, "Nucleotide sequence of the 32 kDa-protein gene (antigen 85A) of Mycobacterium bovis BCG" Nucl. Ac. Res. 18:3995).

3. Oligonucleotides:

Oligonucleotides were synthesized on an Applied Biosystems DNA synthesizer model 381A, purified on OPC-cartridges (Applied Biosystems), lyophilized and dissolved in TE buffer (10 mM Tris-HCl, pH 7.4).

³²P labeling of the oligonucleotides was as described in Sambrook J. et al., 1989, "Molecular Cloning: a Laboratory Manual" Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

4. PCR:

50 ng of Mycobacterium bovis BCG DNA was amplified in a 50-μl reaction mixture containing 1 x PCR-buffer (Amersham), 200 μM dNTP, 1 μM each of sense P78 (5'CCGGAATTCATGGGCCGTGACATCAAG) (SEQ ID NO:33) and antisense P79 (5'CCGGAATTCGGTCTCCCACTTGTAAGT) (SEQ ID NO:34) oligonucleotide primers, and 2 units of Taq DNA polymerase. The location of these two primers is indicated in FIG. 2A; an EcoRI sequence preceded by 3 additional nucleotides was added to both oligonucleotides. After denaturation for 90 seconds at 94° C., the reaction was submitted to 40 cycles consisting of 1 minute at 93° C. (denaturation), 90 seconds at 55° C. (annealing), 2 minutes at 72° C. (extension), followed by a 5 minute final extension at 72° C. After extraction with 150 μl chloroform, the amplified DNA was washed three times with 0.75 ml H₂O in a Centricon-30 for 6 minutes at 6500 rpm in the Sorvall SS 34 rotor. After digestion with EcoRI the DNA was ligated into EcoRI-digested, phosphatase-treated Bluescribe-M13+ vector. DH5α E. coli (Gibco-BRL) were transformed and plated on Hybond-N filters. Colonies were selected by hybridization with ³²P-labeled oligonucleotide probe-A (5'-TCGCCCGCCCTGTACCTG) (SEQ ID NO:35) and oligonucleotide probe-B (5'-TCACCTGCGGTTTATCTG) (SEQ ID NO:36). Hybridization and washing conditions for the oligonucleotides were as described by Jacobs et al. (Jacobs et al., 1988, "The thermal stability of oligonucleotide duplexes is sequence independent in tetraalkylammonium salt solutions: application to identifying recombinant DNA clones" Nucl. Acid Res. 16:4637).
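The primer design and the cycling program described above lend themselves to a quick check. The sketch below splits each primer into its 3 additional nucleotides, the EcoRI recognition sequence (GAATTC) and the gene-specific part, and adds up the programmed incubation times. The function names `split_primer` and `program_seconds` are assumptions, and the total ignores temperature ramping between steps.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

PRIMERS = {
    "P78": "CCGGAATTCATGGGCCGTGACATCAAG",  # sense, SEQ ID NO:33
    "P79": "CCGGAATTCGGTCTCCCACTTGTAAGT",  # antisense, SEQ ID NO:34
}


def split_primer(primer, site=ECORI_SITE, extra=3):
    """Split a primer into (extra nucleotides, restriction site, specific part)."""
    head = primer[:extra]
    found = primer[extra:extra + len(site)]
    if found != site:
        raise ValueError(f"expected {site} after {extra} nucleotides, got {found}")
    return head, found, primer[extra + len(site):]


def program_seconds(initial=90, cycles=40, steps=(60, 90, 120), final=300):
    """Total programmed time: initial denaturation, cycles, final extension."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, split_primer(sequence))
    total = program_seconds()
    print(f"programmed time: {total} s ({total / 60:.1f} min)")
```

With the values given in the text, the programmed time is 90 + 40 x 270 + 300 = 11190 seconds, or about 186.5 minutes.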

5. Screening of the λgt11 M. tuberculosis and Mycobacterium bovis BCG recombinant DNA libraries:

The two λgt11 recombinant libraries were screened by colony hybridization (Sambrook J. et al., 1989, "Molecular Cloning: a Laboratory Manual" Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.) with an 800 bp HindIII fragment of the previously cloned gene 85A (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis" Infect. Immun. 57:3123) which does not discriminate gene 85A from 85B (see FIGS. 2A and 4A). Twelve positive M. tuberculosis and 12 Mycobacterium bovis BCG plaques were retained and screened by hybridization with ³²P-labeled oligonucleotide-probe C (5'-TCGCAGAGCAACGGCCAGAACTAC) (SEQ ID NO:37) as described above.

From the M. tuberculosis λgt11 library, one selected bacteriophage #11 was partially digested with EcoRI and its 5 kbp insert was subcloned in Bluescribe-M13+. From this recombinant plasmid named 11-2, a 3,500 bp BamHI-EcoRI fragment was subcloned in M13-mp18 and M13-mp19 (Sambrook J. et al., 1989, "Molecular Cloning: a Laboratory Manual" Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.).
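Translating oligonucleotide probe C with the standard genetic code shows which peptide it encodes. The sketch below is an illustration only: the names `CODON_TABLE` and `translate` are assumptions, and the reading frame is taken to start at the first nucleotide of the probe.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    probe_c = "TCGCAGAGCAACGGCCAGAACTAC"  # SEQ ID NO:37
    print(translate(probe_c))  # prints SQSNGQNY
```

The result, SQSNGQNY, is the sequence given as SEQ ID NO:4 earlier in this description, which is consistent with probe C being used to pick out the 85C-type clones.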

6. Recombinant DNA analysis:

Recombinant DNA analysis was as described in Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis" Infect. Immun. 57:3123.

7. Sequencing:

Sequence analysis was done by the primer extension dideoxy termination method of Sanger et al. (Sanger F. et al., 1977, "DNA sequencing with chain termination inhibitors" Proc. Natl. Acad. Sci. USA 74:5463) after subcloning of specific fragments in Bluescribe-M13+ (Chen E. J. et al., 1985, "Supercoil sequencing: a fast simple method for sequencing plasmid DNA" DNA 4:165) or in mp18 and mp19 M13 vectors. Sequence analysis was greatly hampered by the high GC content of the M. tuberculosis DNA (65%). Sequencing reactions were therefore performed with several DNA polymerases according to the manufacturers' protocols: T7 DNA polymerase ("Sequenase" USB), T7 DNA polymerase (Pharmacia), and Taq DNA polymerase (Promega) using 7-deaza-dGTP instead of dGTP. Several oligodeoxynucleotides were synthesized and used to focus on ambiguous regions of the sequence. The sequencing strategy is summarized in FIG. 1.
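The GC content mentioned above is simply the fraction of G and C bases in a sequence. A minimal sketch follows; the function name `gc_content` is an assumption, and the sample input is oligonucleotide probe A from the PCR section, chosen only because it is quoted in this description.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in sequence) / len(sequence)


if __name__ == "__main__":
    probe_a = "TCGCCCGCCCTGTACCTG"  # SEQ ID NO:35, sample input only
    print(f"GC content of probe A: {gc_content(probe_a):.0%}")
```

A short probe can deviate considerably from the genome-wide average, so a single oligonucleotide should not be read as an estimate of the 65% figure quoted for M. tuberculosis DNA.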

8. Sequence comparison and analysis:

Routine computer-aided analysis of the nucleic acid and deduced amino acid sequences was performed with the LGBC program from Bellon B., 1988, "Apple Macintosh programs for nucleic and protein sequence analysis" Nucleic Acid Res. 16:1837. Homology searches used the FASTA programs from Pearson W. R. et al., 1988, "Improved tools for biological sequence comparison" Proc. Natl. Acad. Sci. USA 85:2444, and the various DNA and protein data banks from the EMBL-server facilities. Multiple alignments were obtained with `Align 1.01` (Scientific and Educational Software).

9. Southern blot analysis:

Genomic DNA from Mycobacterium bovis BCG was completely digested with SphI, EcoRI or KpnI, electrophoresed on a 1% agarose gel, transferred to a Hybond-N filter (Amersham) after denaturation and neutralization, and hybridized either with ³²P-labeled oligonucleotide probes (A, B, C) under the conditions described in Jacobs et al., 1988, "The thermal stability of oligonucleotide duplexes is sequence independent in tetraalkylammonium salt solutions: application to identifying recombinant DNA clones" Nucl. Ac. Res. 16:4637, or with random-primed ³²P-labeled DNA restriction fragments that were found to discriminate the 3 genes 85A, 85B, and 85C.

Probe 85A was a 230 bp PstI fragment from plasmid BY-5 (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis" Infect. Immun. 57:3123 and FIG. 2A). Probe 85B was a 400 bp SmaI-EcoRV fragment from an 85B recombinant plasmid named 5.1, derived from our Mycobacterium bovis BCG λgt11 library, whose map is presented in FIG. 4A (see also FIG. 2A). Probe 85C was a 280 bp SmaI-KpnI fragment from plasmid 11.2 (see also FIGS. 4A and 2A).

These DNA fragments were prepared by gel electrophoresis on low melting point agarose followed by a rapid purification on Qiagen (marketed by Westburg, Netherlands) (tip 5) according to the manufacturer's protocol and labeled in the presence of α-³²P-dCTP (Feinberg A. P. et al., 1983, "A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity" Anal. Biochem. 132:6).

10. Pulse Field electrophoresis DNA separation:

DNA preparation, restriction enzyme digestion and pulse-field gel electrophoresis were performed as described by Vincent Levy-Frebault V. et al., 1990 ("DNA polymorphism in Mycobacterium paratuberculosis, "wood pigeon mycobacteria" and related mycobacteria analyzed by field inversion gel electrophoresis", J. Clin. Microbiol. 27:2723). Briefly, cells from fresh cultures were mixed with 1% low-melting-point agarose (v/v) and submitted to successive treatments with zymolase (Seikagaku Kogyo, Tokyo, Japan), lysozyme, and sodium dodecyl sulfate in the presence of proteinase K (Boehringer GmbH, Mannheim, Germany). After inactivation of proteinase K with phenylmethylsulfonyl fluoride (Bio-Rad Laboratories), agarose blocks were digested overnight with 50 U of DraI (Bio-Rad Laboratories). The blocks were then loaded into a 1% agarose gel prepared and electrophoresed in 0.66 TBE (Tris-boric acid-EDTA). Field inversion gel electrophoresis was carried out using a Dnastar Pulse (Dnastar, USA) apparatus. Forward and reverse pulses were set at 0.33 sec and 0.11 sec at the beginning of the run and 60 sec and 20 sec (or 30 sec and 10 sec) at the end of the run, depending on the molecular weight zone to be expanded. The run time was set at 36 h, the voltage used was 100 V, producing about 325 mA, and the temperature was maintained at 18° C. Lambda concatemers were used as molecular weight markers. At the end of the run, the gels were stained with ethidium bromide, photographed under UV light and transferred onto nylon membranes according to Maniatis T. et al., 1982, "Molecular cloning: a laboratory manual" Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. 545 pp.

RESULTS

1. Cloning of the 85C gene of M. tuberculosis:

Since no specific probe or monoclonal antibody was available to specifically detect an 85C or related antigen, which was expected to bear extensive homology to gene 85A and gene 85B, this screening required the development of a new procedure. The strategy used was based on the PCR amplification of a 245 bp DNA fragment coding for amino acids 18-98 of the mature antigen 85A, chosen because it is surrounded at both ends by highly conserved DNA sequences when the sequences of antigen A and B are aligned (see primers P78 and P79 in FIG. 2A). It was thus supposed that an equivalent homology might exist with the sequence of antigen 85C in the same region.

From Mycobacterium bovis BCG genomic DNA, a 245 bp DNA fragment was readily obtained. The latter was purified and subcloned in a Bluescribe M13+ vector after digestion with EcoRI. About 80 recombinant plasmid-containing colonies were tested by plating on nylon filters and hybridized under stringent conditions with a labeled synthetic oligonucleotide recognizing either sequence 85A (5'-TCGCCCGCCCTGTACCTG) (SEQ ID NO:35) or sequence 85B (5'-TCACCTGCGGTTTATCTG) (SEQ ID NO:36) within the PCR amplified fragment (see FIG. 2A). Several clones that hybridized with each oligonucleotide probe were sequenced and the sequences were all identical to sequence 85A in the clones hybridizing with oligoprobe A and to sequence 85B for those hybridizing with oligoprobe B. Several of the remaining clones were sequenced and they all showed a marked sequence divergence from 85A and 85B covering a 24-nucleotide stretch which is totally distinct from sequences A and B (FIG. 2A, box marked C) (the homology to sequence B is only 33% in this region). Assuming these inserts might represent an amplified fragment of the 85C gene and that this 24-nucleotide sequence is characteristic of the putative 85C gene, an oligonucleotide probe (oligo 85C) based on this sequence was synthesized.

The latter probe was labeled with ³²P and used to screen a collection of 24 λgt11 recombinant phages that were selected in our M. tuberculosis and Mycobacterium bovis BCG λgt11 libraries by hybridization with an 800 bp non-specific HindIII DNA fragment of the previously cloned gene 85A.

One hybridizing λgt11-M. tuberculosis recombinant was retained, characterized by restriction mapping and sequenced.

2. Sequence of the 85C gene of Mycobacterium tuberculosis:

The 1211 nucleotide sequence derived from various sequenced fragments is represented in FIG. 1. The DNA sequence contains a 1,020-bp-long open reading frame, starting at position 150 and ending with a TGA codon at position 1170. The common NH2 terminal amino acid sequence of the antigen 85 proteins, Phe-Ser-Arg-Pro-Gly-Leu (SEQ ID NO:1) (De Bruyn J. et al., 1987, "Purification, partial characterization and identification of a 32 kDa protein antigen of Mycobacterium bovis BCG" Microb. Pathogen. 2:351), could be located within this open reading frame from the nucleotide sequence beginning with a TTC codon at position 288 (FIG. 1). Therefore, the DNA region upstream from this sequence is expected to code for a signal peptide required for the secretion of this antigen. The mature protein consists of 294 amino acid residues corresponding to a calculated molecular weight of 32,021.
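The coordinates quoted above are mutually consistent, which can be checked with simple codon arithmetic. The following is a minimal sketch, assuming 1-based nucleotide positions as used in the text; the helper `codon_count` and the constant names are hypothetical and introduced only for illustration.

```python
ORF_START = 150      # first nucleotide of the open reading frame
STOP_CODON = 1170    # first nucleotide of the TGA stop codon
MATURE_START = 288   # first nucleotide of the TTC (Phe) codon of the mature protein


def codon_count(first_nt, end_nt):
    """Number of whole codons in the half-open interval [first_nt, end_nt)."""
    length = end_nt - first_nt
    if length % 3 != 0:
        raise ValueError("coordinates are not in the same reading frame")
    return length // 3


orf_bp = STOP_CODON - ORF_START                          # coding nucleotides
precursor_residues = codon_count(ORF_START, STOP_CODON)  # full precursor
signal_residues = codon_count(ORF_START, MATURE_START)   # upstream of Phe
mature_residues = precursor_residues - signal_residues

print(orf_bp, precursor_residues, signal_residues, mature_residues)
# 1020 340 46 294
```

The 340-residue precursor agrees with the length given for SEQ ID NO:3 in the sequence listing, and the 294-residue mature protein agrees with the figure stated above.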

Interestingly, the N-terminal sequence of the mature protein contains the entire 26 amino acid sequence (phe-ser-arg-pro-gly-leu-pro-val-glu-tyr-leu-gln-val-pro-ser-ala-ser-met-gly-arg-asp-ile-lys-val-gln-phe) (SEQ ID NO:38) described by Wiker H. G. et al., 1990, "Evidence for three separate genes encoding the proteins of the mycobacterial antigen 85 complex" Infect. Immun. 58:272, which differs from the common 85B and 85A sequence only by an alanine instead of a proline in position 16 of the mature protein. Two ATG codons were found to precede the TTC phenylalanine codon at nucleotide position 288 (FIG. 1) in the same reading frame. Use of these two ATG codons would lead to the synthesis of signal peptides of either 21 or 46 amino acid residues (the latter situation has been represented in FIG. 1 for reasons indicated below).
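The two candidate signal peptide lengths can be recovered by scanning the region upstream of the TTC codon for in-frame ATG codons. The sketch below is illustrative only: the sequence is transcribed from nucleotides 150 to 287 of SEQ ID NO:2 as printed in the sequence listing, and the function name `in_frame_atg_positions` is a hypothetical helper.

```python
# First 46 codons of the open reading frame (nucleotides 150-287 of SEQ ID NO:2),
# written codon by codon for readability and then joined.
SIGNAL_REGION = (
    "ATG ACG TTC TTC GAA CAG GTG CGA "
    "AGG TTG CGG AGC GCA GCG ACA ACC CTG CCG CGC CGC GTG GCT ATC GCG "
    "GCT ATG GGG GCT GTC CTG GTT TAC GGT CTG GTC GGT ACC TTC GGC GGG "
    "CCG GCC ACC GCG GGC GCA"
).replace(" ", "")

ORF_START = 150      # position of the first nucleotide of SIGNAL_REGION
MATURE_START = 288   # TTC codon of the mature N-terminal phenylalanine


def in_frame_atg_positions(region, first_nt):
    """Return the 1-based positions of ATG codons lying in frame with first_nt."""
    return [
        first_nt + i
        for i in range(0, len(region) - 2, 3)
        if region[i:i + 3] == "ATG"
    ]


atg_positions = in_frame_atg_positions(SIGNAL_REGION, ORF_START)
signal_lengths = [(MATURE_START - pos) // 3 for pos in atg_positions]

print(atg_positions)   # [150, 225]
print(signal_lengths)  # [46, 21]
```

Only codons in the reading frame of position 150 are examined, so out-of-frame ATG triplets are ignored by construction.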

The base composition of the antigen 85C gene was identical to that of the 85A gene, with an overall G-C composition of 64.57% and a strong preference for G or C in codon position 3 (average 85%). In contrast to antigens 85A and 85B, which contain 3 cysteines, the sequence of antigen 85C shows a single cysteine residue at position 254. In fact, the two substituted cysteines are located in the region of the mature 85C protein which contains the largest divergent sequence block (FIG. 2B) (SQSNGQNY) (SEQ ID NO:4) (the corresponding DNA sequence was used to synthesize the oligonucleotide probe "C" (see above)). Not surprisingly, this hydrophilic region is also the most divergent when the hydropathy plots of the 3 antigens are compared and thus could be either a variable "epitope" of all 85-antigens and/or a characteristic epitope of antigen 85C, since it was also found in antigen 85C from M. bovis BCG (FIG. 2B, fifth line).
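The two composition figures quoted above (overall G-C content and G or C in codon position 3) are simple counts over the coding sequence. The following sketch shows one way such values could be computed; it is not the authors' program, the function names are hypothetical, and the example input is only the first eight codons of the open reading frame in SEQ ID NO:2, so its output does not reproduce the whole-gene values.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a non-empty nucleotide string."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(orf):
    """Fraction of G or C at the third position of each complete codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3   # drop a trailing partial codon
    third_positions = orf[2:usable:3]  # every third base, starting at base 3
    return gc_fraction(third_positions)


# First eight codons of the 85C open reading frame (nucleotides 150-173).
ORF_HEAD = "ATGACGTTCTTCGAACAGGTGCGA"

print(gc_fraction(ORF_HEAD))   # 0.5
print(gc3_fraction(ORF_HEAD))  # 0.75
```

Applied to the full 1,020 bp coding region, the same two functions would give the quantities that the text reports as 64.57% and about 85%.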

Another characteristic feature of antigen 85C is the presence of the unusual hydrophobic repetitive proline-alanine motif PPAAPAAPAA (SEQ ID NO:7) at the carboxy terminus of the molecule.

3. Hydropathy pattern:

The hydropathy pattern of M. tuberculosis 85C antigen was determined by the method of Kyte and Doolittle (Kyte J. et al., 1982, "Simple method for displaying the hydropathy character of a protein" J. Mol. Biol. 157:105). The octapeptide profiles were compared to antigen 85A and 85B (FIG. 3). As anticipated from the amino acid sequences, the patterns are roughly similar for the three antigens except for some major differences at region 84-92 and in the carboxy-terminal part of the three proteins.
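A Kyte and Doolittle profile is a moving average of per-residue hydropathy indices. The sketch below uses the published Kyte-Doolittle scale with a window of eight residues to match the octapeptide profiles mentioned above; averaging (rather than summing) over the window and the function name `hydropathy_profile` are assumptions made for illustration, not details taken from the text.

```python
# Kyte-Doolittle hydropathy indices (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=8):
    """Mean hydropathy of every window-length stretch along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Divergent block (SEQ ID NO:4) and C-terminal motif (SEQ ID NO:7) of antigen 85C.
divergent = hydropathy_profile("SQSNGQNY")
c_terminal = hydropathy_profile("PPAAPAAPAA")

print([round(v, 3) for v in divergent])
print([round(v, 3) for v in c_terminal])
```

On this scale the SQSNGQNY block scores negative (hydrophilic) and the proline-alanine repeat scores positive, in line with the descriptions of these two regions in the text.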

4. Sequence homologies:

DNA sequences from antigen 85A (Borremans L. et al., 1989, "Cloning, sequence determination and expression of a 32-kilodalton protein gene of Mycobacterium tuberculosis" Infect. Immun. 57:3123; De Wit L. et al., 1990, "Nucleotide sequence of the 32 kDa-protein gene (antigen 85A) of Mycobacterium bovis BCG" Nucl. Ac. Res. 18:3995), 85B (Matsuo K. et al., 1988, "Cloning and expression of the Mycobacterium bovis BCG gene for extracellular α-antigen" J. Bacteriol. 170:3847; Matsuo et al., 1990, "Cloning and Expression of the gene for cross-reactive α-antigen of M. kansasii" Infect. Immunity 58:550-556) and 85C were aligned. An alignment of the three DNA sequences is shown in FIG. 2A. At the DNA level, the homology is maximal between the regions coding for the 3 mature proteins. In this region, the homology between A and B is 77.5% whereas it reaches only 70.8% between the coding regions of genes A and C and 71.9% between B and C, respectively. Beyond nucleotide 1369 of sequence 85A and upstream from nucleotide position 475 (i.e. within the signal sequence and promoter region) there is practically no homology between the 3 sequences. No significant homology was detected to other DNA sequences present in the latest release of GenBank-EMBL.

Homologies at the amino acid level are presented in the alignment in FIG. 2B, again indicating a higher homology between sequences A and B (80.4%) than between B/C or A/C.
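Percentages of this kind are typically obtained by counting identical positions in a pairwise alignment. The text does not state how gaps were treated, so the sketch below makes an explicit assumption: columns containing a gap character ("-") are skipped. The function name `percent_identity` is hypothetical, and the example uses the two 18-mer oligonucleotides specific for 85A and 85B quoted earlier (SEQ ID NO:35 and NO:36), assuming they cover corresponding positions of the two genes.

```python
def percent_identity(a, b):
    """Percent identical positions between two aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


OLIGO_85A = "TCGCCCGCCCTGTACCTG"  # SEQ ID NO:35
OLIGO_85B = "TCACCTGCGGTTTATCTG"  # SEQ ID NO:36

print(round(percent_identity(OLIGO_85A, OLIGO_85B), 1))  # 66.7
```

The two oligonucleotides match at 12 of 18 positions, which illustrates why they can discriminate the 85A and 85B genes under stringent hybridization conditions.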

Other comparisons between the 85C antigen and the entire SwissProt-NBRF data bank failed to detect any significant homologies to the 85C antigen amino acid sequence. As for the 85A antigen, the 85C sequence does not contain the RGD motif of fibronectin binding proteins nor does it share any homology to the known fibronectin receptors or to the fibronectin binding protein from Staphylococcus aureus.

Comparison of the partial PCR-derived DNA sequence of the 85C gene of M. bovis BCG 1173P₂ with that of Mycobacterium tuberculosis shows complete identity, including the characteristic region corresponding to synthetic oligonucleotide C (see FIG. 2A).

5. Genome characterization:

In order to confirm the existence of different genes coding for the antigen 85 complex, M. bovis BCG genomic DNA was digested with SphI, EcoRI and KpnI and the distribution of radioactive signals was examined in Southern blot after hybridization with three specific oligonucleotide probes (A, B, C) (see Materials and Methods and FIG. 2A). Three clearly distinct patterns were obtained, confirming the specificity of these probes. Similar type-specific profiles could be obtained with three random-priming-labeled DNA restriction fragments (probe 85A, 230 bp; 85B, 400 bp; 85C, 280 bp) which were selected within the promoter signal sequence of the three DNAs (FIGS. 2A and 4A). With these three DNA restriction fragments, additional weak bands are also observed which clearly correspond to cross hybridization of the probes to the other two genes. With probe 85C, an additional KpnI fragment was observed that does not hybridize to the C-oligonucleotide probe. This probably indicates that the corresponding KpnI site is located upstream from this gene. Furthermore, the sizes of the observed restriction fragments are not always exactly as expected from the restriction maps of the corresponding cloned genes. These discrepancies probably correspond to some minor sequence differences (restriction polymorphism), possibly in non-coding DNA regions (outside of the DNA coding for the antigen 85), between the strain of M. bovis BCG used here and M. bovis BCG (strain Tokyo) and M. tuberculosis, respectively.

6. Pulse field analysis of M. tuberculosis genomic DNA:

When the largest available 85A clone BY-5 was hybridized (FIG. 4A) with oligonucleotide probe B, no positive signal was detected whereas oligonucleotide probe A gave a positive hybridization (not shown). This indicates that gene B is not located within 2-2.5 kb of the 5' and 4.0 kb of the 3' border of gene A (FIG. 4A). To confirm and extend this result, pulse-field separated DraI-digested M. tuberculosis genomic DNA was further hybridized with the three specific DNA restriction fragments as probes (85A, 85B and 85C) under stringent conditions.

Eight strains of M. tuberculosis were compared, showing six different patterns, three of which are illustrated in FIG. 5. For most strains examined, the three probes hybridized to fragments of different sizes. For instance, the respective sizes of the DraI fragments hybridizing with probes 85A, B and C were about 242 kb, 212 kb and 225 kb for strain H37Ra, 403 kb, 212 kb and 104 kb for strain H37Rv, and 355 kb, 104 kb and 153 kb for strain "11025". Although various strains show some restriction fragment length polymorphism with restriction endonuclease DraI, the simplest interpretation of these results is that the three antigen 85 genes are distantly located (>100 kb) within the mycobacterial genome.

    __________________________________________________________________________    #             SEQUENCE LISTING    - (1) GENERAL INFORMATION:    -    (iii) NUMBER OF SEQUENCES: 38    - (2) INFORMATION FOR SEQ ID NO:1:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 6 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -    (vii) IMMEDIATE SOURCE:    #amino acid sequence of Antigenal                   85A    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:    - Phe Ser Arg Pro Gly Leu    1               5    - (2) INFORMATION FOR SEQ ID NO:2:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 1211 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:              (B) CLONE: Nucleotide o - #f 85C antigen containing region                   M. tuberc - #ulosis    -     (ix) FEATURE:              (A) NAME/KEY: CDS              (B) LOCATION: 150..1169    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:    - AGGTGTCCGG GCCGACGCTG AATCGTTAGC CAACCGCGAT CTCGCGCTGC GG - #CCACGACA      60    - TTCGAACTGA GCGTCCTCGG TGTGTTTCAC TCGCCCAGAA CAGATTCGAC CG - #CGTCGTGC     120    - GCAGATGAGA GTTGGGATTG GTAGTAGCT ATG ACG TTC TTC GAA - # CAG GTG CGA     173    #              Met Thr P - #he Phe Glu Gln Val Arg    #             5  1    - AGG TTG CGG AGC GCA GCG ACA ACC CTG CCG CG - #C CGC GTG GCT ATC GCG     221    Arg Leu Arg Ser Ala Ala Thr Thr Leu Pro Ar - #g Arg Val Ala Ile Ala    #     20    - GCT ATG GGG GCT GTC CTG GTT TAC GGT CTG GT - #C GGT ACC TTC GGC GGG     269    Ala Met Gly Ala Val Leu Val Tyr Gly Leu Va - #l Gly Thr Phe Gly Gly    # 40    - CCG GCC ACC GCG GGC GCA TTC TCT AGG CCC GG - #T CTT CCA GTG GAA TAT     317    Pro Ala Thr Ala Gly Ala Phe Ser Arg Pro Gl - #y Leu Pro Val Glu Tyr    #                 55    - CTG CAG GTG CCA TCC GCG TCG ATG GGC 
CGC GA - #C ATC AAG GTC CAG TTC     365    Leu Gln Val Pro Ser Ala Ser Met Gly Arg As - #p Ile Lys Val Gln Phe    #             70    - CAG GGC GGC GGA CCG CAC GCG GTC TAC CTG CT - #C GAC GGT CTG CGG GCC     413    Gln Gly Gly Gly Pro His Ala Val Tyr Leu Le - #u Asp Gly Leu Arg Ala    #         85    - CAG GAT GAC TAC AAC GGC TGG GAC ATC AAC AC - #C CCG GCC TTC GAG GAG     461    Gln Asp Asp Tyr Asn Gly Trp Asp Ile Asn Th - #r Pro Ala Phe Glu Glu    #    100    - TAC TAC CAG TCA GGG TTG TCG GTG ATC ATG CC - #C GTG GGC GGC CAA TCC     509    Tyr Tyr Gln Ser Gly Leu Ser Val Ile Met Pr - #o Val Gly Gly Gln Ser    105                 1 - #10                 1 - #15                 1 -    #20    - AGT TTC TAC ACC GAC TGG TAT CAG CCC TCG CA - #G AGC AAC GGC CAG AAC     557    Ser Phe Tyr Thr Asp Trp Tyr Gln Pro Ser Gl - #n Ser Asn Gly Gln Asn    #               135    - TAC ACC TAC AAG TGG GAG ACC TTC CTT ACC AG - #A GAG ATG CCC GCC TGG     605    Tyr Thr Tyr Lys Trp Glu Thr Phe Leu Thr Ar - #g Glu Met Pro Ala Trp    #           150    - CTA CAG GCC AAC AAG GGC GTG TCC CCG ACA GG - #C AAC GCG GCG GTG GGT     653    Leu Gln Ala Asn Lys Gly Val Ser Pro Thr Gl - #y Asn Ala Ala Val Gly    #       165    - CTT TCG ATG TCG GGC GGT TCC GCG CTG ATC CT - #G GCC GCG TAC TAC CCG     701    Leu Ser Met Ser Gly Gly Ser Ala Leu Ile Le - #u Ala Ala Tyr Tyr Pro    #   180    - CAG CAG TTC CCG TAC GCC GCG TCG TTG TCG GG - #C TTC CTC AAC CCG TCC     749    Gln Gln Phe Pro Tyr Ala Ala Ser Leu Ser Gl - #y Phe Leu Asn Pro Ser    185                 1 - #90                 1 - #95                 2 -    #00    - GAG GGC TGG TGG CCG ACG CTG ATC GGC CTG GC - #G ATG AAC GAC TCG GGC     797    Glu Gly Trp Trp Pro Thr Leu Ile Gly Leu Al - #a Met Asn Asp Ser Gly    #               215    - GGT TAC AAC GCC AAC AGC ATG TGG GGT CCG TC - #C AGC GAC CCG GCC TGG     845    Gly Tyr Asn Ala Asn Ser Met Trp Gly Pro Se - #r Ser Asp Pro Ala Trp    #           230    - AAG CGC AAC GAC CCA ATG GTT CAG ATT CCC 
CG - #C CTG GTC GCC AAC AAC     893    Lys Arg Asn Asp Pro Met Val Gln Ile Pro Ar - #g Leu Val Ala Asn Asn    #       245    - ACC CGG ATC TGG GTG TAC TGC GGT AAC GGC AC - #A CCC AGC GAC CTC GGC     941    Thr Arg Ile Trp Val Tyr Cys Gly Asn Gly Th - #r Pro Ser Asp Leu Gly    #   260    - GGC GAC AAC ATA CCG GCG AAG TTC CTG GAA GG - #C CTC ACC CTG CGC ACC     989    Gly Asp Asn Ile Pro Ala Lys Phe Leu Glu Gl - #y Leu Thr Leu Arg Thr    265                 2 - #70                 2 - #75                 2 -    #80    - AAC CAG ACC TTC CGG GAC ACC TAC GCG GCC GA - #C GGT GGA CGC AAC GGG    1037    Asn Gln Thr Phe Arg Asp Thr Tyr Ala Ala As - #p Gly Gly Arg Asn Gly    #               295    - GTG TTT AAC TTC CCG CCC AAC GGA ACA CAC TC - #G TGG CCC TAC TGG AAC    1085    Val Phe Asn Phe Pro Pro Asn Gly Thr His Se - #r Trp Pro Tyr Trp Asn    #           310    - GAG CAG CTG GTC GCC ATG AAG GCC GAT ATC CA - #G CAT GTG CTC AAC GGC    1133    Glu Gln Leu Val Ala Met Lys Ala Asp Ile Gl - #n His Val Leu Asn Gly    #       325    - GCG ACA CCC CCG GCC GCC CCT GCT GCG CCG GC - #C GCC TGAGCCAGCA    1179    Ala Thr Pro Pro Ala Ala Pro Ala Ala Pro Al - #a Ala    #   340    #        1211      AGCG CAACGGCCAG CG    - (2) INFORMATION FOR SEQ ID NO:3:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 340 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:    - Met Thr Phe Phe Glu Gln Val Arg Arg Leu Ar - #g Ser Ala Ala Thr Thr    #                 15    - Leu Pro Arg Arg Val Ala Ile Ala Ala Met Gl - #y Ala Val Leu Val Tyr    #             30    - Gly Leu Val Gly Thr Phe Gly Gly Pro Ala Th - #r Ala Gly Ala Phe Ser    #         45    - Arg Pro Gly Leu Pro Val Glu Tyr Leu Gln Va - #l Pro Ser Ala Ser Met    #     60    - Gly Arg Asp Ile Lys Val Gln Phe Gln Gly Gl - #y Gly Pro His Ala Val    # 80    - Tyr Leu Leu Asp Gly Leu Arg Ala Gln Asp As - #p Tyr Asn Gly Trp Asp 
   #                 95    - Ile Asn Thr Pro Ala Phe Glu Glu Tyr Tyr Gl - #n Ser Gly Leu Ser Val    #           110    - Ile Met Pro Val Gly Gly Gln Ser Ser Phe Ty - #r Thr Asp Trp Tyr Gln    #       125    - Pro Ser Gln Ser Asn Gly Gln Asn Tyr Thr Ty - #r Lys Trp Glu Thr Phe    #   140    - Leu Thr Arg Glu Met Pro Ala Trp Leu Gln Al - #a Asn Lys Gly Val Ser    145                 1 - #50                 1 - #55                 1 -    #60    - Pro Thr Gly Asn Ala Ala Val Gly Leu Ser Me - #t Ser Gly Gly Ser Ala    #               175    - Leu Ile Leu Ala Ala Tyr Tyr Pro Gln Gln Ph - #e Pro Tyr Ala Ala Ser    #           190    - Leu Ser Gly Phe Leu Asn Pro Ser Glu Gly Tr - #p Trp Pro Thr Leu Ile    #       205    - Gly Leu Ala Met Asn Asp Ser Gly Gly Tyr As - #n Ala Asn Ser Met Trp    #   220    - Gly Pro Ser Ser Asp Pro Ala Trp Lys Arg As - #n Asp Pro Met Val Gln    225                 2 - #30                 2 - #35                 2 -    #40    - Ile Pro Arg Leu Val Ala Asn Asn Thr Arg Il - #e Trp Val Tyr Cys Gly    #               255    - Asn Gly Thr Pro Ser Asp Leu Gly Gly Asp As - #n Ile Pro Ala Lys Phe    #           270    - Leu Glu Gly Leu Thr Leu Arg Thr Asn Gln Th - #r Phe Arg Asp Thr Tyr    #       285    - Ala Ala Asp Gly Gly Arg Asn Gly Val Phe As - #n Phe Pro Pro Asn Gly    #   300    - Thr His Ser Trp Pro Tyr Trp Asn Glu Gln Le - #u Val Ala Met Lys Ala    305                 3 - #10                 3 - #15                 3 -    #20    - Asp Ile Gln His Val Leu Asn Gly Ala Thr Pr - #o Pro Ala Ala Pro Ala    #               335    - Ala Pro Ala Ala                340    - (2) INFORMATION FOR SEQ ID NO:4:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 8 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:    - Ser Gln Ser Asn Gly Gln Asn Tyr    1               5    - (2) INFORMATION FOR SEQ ID NO:5:    -      (i) SEQUENCE 
CHARACTERISTICS:    #acids    (A) LENGTH: 10 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:    - Pro Met Val Gln Ile Pro Arg Leu Val Ala    #                10    - (2) INFORMATION FOR SEQ ID NO:6:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 22 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:    - Gly Leu Thr Leu Arg Thr Asn Gln Thr Phe Ar - #g Asp Thr Tyr Ala Ala    #                15    - Asp Gly Gly Arg Asn Gly                20    - (2) INFORMATION FOR SEQ ID NO:7:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 10 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:    - Pro Pro Ala Ala Pro Ala Ala Pro Ala Ala    #                10    - (2) INFORMATION FOR SEQ ID NO:8:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 14 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:    - Phe Ser Arg Pro Gly Leu Pro Val Glu Tyr Le - #u Gln Val Pro    #                10    - (2) INFORMATION FOR SEQ ID NO:9:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 24 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:    #                24ACCC CGGC    - (2) INFORMATION FOR SEQ ID NO:10:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 24 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO:10:    #                24GTTG CCAG    - (2) INFORMATION FOR SEQ ID NO:11:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 24 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:    #                24GCTG CCAG    - (2) INFORMATION FOR SEQ ID NO:12:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 24 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:    #                24GCTG CACG    - (2) INFORMATION FOR SEQ ID NO:13:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 8 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:    - Gly Trp Asp Ile Asn Thr Pro Ala    1               5    - (2) INFORMATION FOR SEQ ID NO:14:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 8 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:    - Ala Cys Gly Lys Ala Gly Cys Gln    1               5    - (2) INFORMATION FOR SEQ ID NO:15:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 8 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:    - Ala Cys Gly Lys Ala Gly Cys Thr    1               5    - (2) INFORMATION FOR SEQ ID NO:16:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:    - 
Asp Gly Leu Arg Ala Gln Asp Asp Tyr Asn Gl - #y Trp Asp Ile Asn Thr    #                15    - Pro Ala Phe Glu                20    - (2) INFORMATION FOR SEQ ID NO:17:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:    - Thr Asp Trp Tyr Gln Pro Ser Gln Ser Asn Gl - #y Gln Asn Tyr Thr Tyr    #                15    - Lys Trp Glu Thr                20    - (2) INFORMATION FOR SEQ ID NO:18:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:    - Ala Asn Ser Met Trp Gly Pro Ser Ser Asp Pr - #o Ala Trp Lys Arg Asn    #                15    - Asp Pro Met Val                20    - (2) INFORMATION FOR SEQ ID NO:19:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:    - Arg Ile Trp Val Tyr Cys Gly Asn Gly Thr Pr - #o Ser Asp Leu Gly Gly    #                15    - Asp Asn Ile Pro                20    - (2) INFORMATION FOR SEQ ID NO:20:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:    - Asn Gln Thr Phe Arg Asp Thr Tyr Ala Ala As - #p Gly Gly Arg Asn Gly    #                15    - Val Phe Asn Phe                20    - (2) INFORMATION FOR SEQ ID NO:21:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ 
ID NO:21:    - Gly Val Phe Asn Phe Pro Pro Asn Gly Thr Hi - #s Ser Trp Pro Tyr Trp    #                15    - Asn Glu Gln Leu                20    - (2) INFORMATION FOR SEQ ID NO:22:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 20 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: peptide    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:    - Asp Ile Gln His Val Leu Asn Gly Ala Thr Pr - #o Pro Ala Ala Pro Ala    #                15    - Ala Pro Ala Ala                20    - (2) INFORMATION FOR SEQ ID NO:23:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 1462 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m tuberculosis    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:    - TGACCGGCAC CGCGATACGT TGCGGCAGGC ATCTGGGCTG GCGGTGGTTC GC - #CGCTCCGA      60    - AGCCGTCGAA CACCATCGCC AGCGCGGCCC GGCCCGCCAC CGGGAGTGAG GG - #GCAATGAG     120    - CGCGGGGGCA ATACTGACAG CAAGATCACA ATTGAGCCGG CACATGCGTC GA - #CACATGCC     180    - CAGACACTGC GGAAATGCCA CCTTCAGGCC GTCGCGTCGG TCCCGAATTG GC - #CGTGAACG     240    - ACCGCCGGAT AAGGGTTTCG GCGGTGCGCT TGATGCGGGT GGACGCCCGA AG - #TTGTGGTT     300    - GACTACACGA GCACTGCCGG GCCCAGCGCC TGCAGTCTGA CCTAATTCAG GA - #TGCGCCCA     360    - AACATGCATG GATGCGTTGA GATGAGGATG AGGGAAGCAA GAATGCAGCT TG - #TTGACAGG     420    - GTTCGTGGCG CCGTCACGGG TATGTCGCGT CGACTCGTGG TCGGGGCCGT CG - #GCGCGGCC     480    - CTAGTGTCGG GTCTGGTCGG CGCCGTCGGT GGCACGGCGA CCGCGGGGGC AT - #TTTCCCGG     540    - CCGGGCTTGC CGGTGGAGTA CCTGCAGGTG CCGTCGCCGT CGATGGGCCG TG - #ACATCAAG     600    - GTCCAATTCC AAAGTGGTGG TGCCAACTCG CCCGCCCTGT ACCTGCTCGA CG - #GCCTGCGC     660    - GCGCAGGACG ACTTCAGCGG CTGGGACATC AACACCCCGG CGTTCGAGTG GT - #ACGACCAG     720    - TCGGGCCTGT CGGTGGTCAT GCCGGTGGGT GGCCAGTCAA 
GCTTCTACTC CG - #ACTGGTAC     780    - CAGCCCGCCT GCGGCAAGGC CGGTTGCCAG ACTTACAAGT GGGAGACCTT CC - #TGACCAGC     840    - GAGCTGCCGG GGTGGCTGCA GGCCAACAGG CACGTCAAGC CCACCGGAAG CG - #CCGTCGTC     900    - GGTCTTTCGA TGGCTGCTTC TTCGGCGCTG ACGCTGGCGA TCTATCACCC CC - #AGCAGTTC     960    - GTCTACGCGG GAGCGATGTC GGGCCTGTTG GACCCCTCCC AGGCGATGGG TC - #CCACCCTG    1020    - ATCGGCCTGG CGATGGGTGA CGCTGGCGGC TACAAGGCCT CCGACATGTG GG - #GCCCGAAG    1080    - GAGGACCCGG CGTGGCAGCG CAACGACCCG CTGTTGAACG TCGGGAAGCT GA - #TCGCCAAC    1140    - AACACCCGCG TCTGGGTGTA CTGCGGCAAC GGCAAGCCGT CGGATCTGGG TG - #GCAACAAC    1200    - CTGCCGGCCA AGTTCCTCGA GGGCTTCGTG CGGACCAGCA ACATCAAGTT CC - #AAGACGCC    1260    - TACAACGCCG GTGGCGGCCA CAACGGCGTG TTCGACTTCC CGGACAGCGG TA - #CGCACAGC    1320    - TGGGAGTACT GGGGCGCGCA GCTCAACGCT ATGAAGCCCG ACCTGCAACG GG - #CACTGGGT    1380    - GCCACGCCCA ACACCGGGCC CGCGCCCCAG GGCGCCTAGC TCCGAACAGA CA - #CAACATCT    1440    #               1462TGG NN    - (2) INFORMATION FOR SEQ ID NO:24:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 1091 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m bovis    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:    - ACGACTTTCG CCCGAATCGA CATTTGGCCT CCACACACGG TATGTTCTGG CC - #CGAGCACA      60    - CGACGACATA CAGGACAAAG GGGCACAGGT ATGACAGACG TGAGCCGAAA GA - #TTCGAGCT     120    - TGGGGACGCC GATTGATGAT CGGCACGGCA GCGGCTGTAG TCCTTCCGGG CC - #TGGTGGGG     180    - CTTGCCGGCG GAGCGGCAAC CGCGGGCGCG TTCTCCCGGC CGGGGCTGCC GG - #TCGAGTAC     240    - CTGCAGGTGC CGTCGCCGTC GATGGGCCGC GACATCAAGG TTCAGTTCCA GA - #GCGGTGGG     300    - AACAACTCAC CTGCGGTTTA TCTGCTCGAC GGCCTGCGCG CCCAAGACGA CT - #ACAACGGC     360    - TGGGATATCA ACACCCCGGC GTTCGAGTGG TACTACCAGT CGGGACTGTC GA - #TAGTCATG     420    - CCGGTCGGCG GGCAGTCCAG CTTCTACAGC 
GACTGGTACA GCCCGGCCTG CG - #GTAAGGCT     480    - GGCTGCCAGA CTTACAAGTG GGAAACCCTC CTGACCAGCG AGCTGCCGCA AT - #GGTTGTCC     540    - GCCAACAGGG CCGTGAAGCC CACCGGCAGC GCTGCAATCG GCTTGTCGAT GG - #CCGGCTCG     600    - TCGGCAATGA TCTTGGCCGC CTACCACCCC CAGCAGTTCA TCTACGCCGG CT - #CGCTGTCG     660    - GCCCTGCTGG ACCCCTCTCA GGGGATGGGC CTGATCGGCC TCGCGATGGG TG - #ACGCCGGC     720    - GGTTACAAGG CCGCAGACAT GTGGGGTCCC TCGAGTGACC CGGCATGGGA GC - #GCAACGAC     780    - CCTACGCAGC AGATCCCCAA GCTGGTCGCA AACAACACCC GGCTATGGGT TT - #ATTGCGGG     840    - AACGGCACCC CGAACGAGTT GGGCGGTGCC AACATACCCG CCGAGTTCTT GG - #AGAACTTC     900    - GTTCGTAGCA GCAACCTGAA GTTCCAGGAT GCGTACAAGC CCGCGGGCGG GC - #ACAACGCC     960    - GTGTTCAACT TCCCGCCCAA CGGCACGCAC AGCTGGGAGT ACTGGGGCGC TC - #AGCTCAAC    1020    - GCCATGAAGG GTGACCTGCA GAGTTCGTTA GGCGCCGGCT GACGGGATCA AC - #CGAAGGTT    1080    #     1091    - (2) INFORMATION FOR SEQ ID NO:25:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 1335 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m kansasii    -    (vii) IMMEDIATE SOURCE:    #from M. 
kansasiiNE: Antigen 85B    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:    - GTTAACTATT CTTTGTACCG CTCCCCGCCT GCCGCCTTCT GCCCTGCTCC GG - #GTGCATAG      60    - CACCCGTTTG CGCTCCGGAT TATCCGGGCC GCAACGGGGC AACGGGGGAA GC - #GGGTGAGT     120    - CCGTCGCCGA CTCGCATAGC ACCGTTGCTG TGTTGGCGGG GGTAACCGAT AT - #CGAAATGG     180    - AATGACTTCG CGTCCCGATC GACATTTGCC CTACTCACAC GGTAAGTTCT GC - #CGGGAGCA     240    - CGCGAGCACA TACGGACAAG GGGCAGGGTA TGACAGACGT GAGCGGGAAG AT - #TCGGGCGT     300    - GGGGCCGACG CCTTCTGGTC GGCGCGGCCG CTGCTGCGGC CCTTCCTGGC CT - #GGTCGGAC     360    - TCGCCGGCGG AGCGGCGACC GCGGGAGCGT TCTCCCGTCC CGGCCTGCCG GT - #GGAGTACC     420    - TCCAGGTGCC GTCGGCTGCG ATGGGTCGCA GTATCAAGGT TCAATTCCAA AG - #TGGCGGGG     480    - ACAACTCGCC GGCGGTGTAC CTGCTCGACG GTCTCCGCGC TCAAGACGAC TA - #CAACGGCT     540    - GGGACATCAA CACCCCGGCC TTCGAGTGGT ACTACCAATC GGGCCTGTCG GT - #CATCATGC     600    - CGGTCGGCGG ACAGTCCAGT TTCTACAGTG ACTGGTACAG CCCGGCCTGC GG - #CAAGGCCG     660    - GCTGCACGAC CTACAAGTGG GAGACCTTCC TGACCAGCGA GCTGCCGCAA TG - #GCTGTCCG     720    - CGAACCGGAG TGTCAAGCCC ACCGGAAGCG CCGCGGTCGG CATCTCGATG GC - #CGGCTTGT     780    - CGGCCCTGAT CCTGTCCGTC TACCACCCGC AGCAGTTCAT CTACGCGGGT TC - #GTTGTCGG     840    - CCCTGATGGA CCCCTCCCAG GGGATGGGGC CGTCTCTGAT CGGCTTGGCG AT - #GGGTGACG     900    - CCGGTGGTTA CAAGGCCTCG GACATGTGGG GACCCTCGAG TGACCCAGCC TG - #GCAGCGTA     960    - ACGACCCGTC GCTGCACATT CCGGAGCTGG TCGCCAACAA CACCCGCCTG TG - #GATCTACT    1020    - GCGGCAACGG CACCCCGTCC GAGTTGGGCG GTGCCAATGT TCCGGCCGAA TT - #CCTGGAGA    1080    - ACTTCGTTCG CAGCAGCAAC CTGAAATTCC AGGACGCCTA CAACGCCGCG GG - #CGGGCGGC    1140    - CACAACGCCG TGTTCAATTT GGACGCCAAC GGAACGCACA GCTGGGAGTA CT - #GGGGCGCG    1200    - CAGCTCAACG CCATGAAGGG TGACCTGCAG GCCAGCCTGG GCGCCCGCTG AT - #CGCGCAAC    1260    - GGTTGCCGCT ACTGGGCTTG ACGGCAAGAC GCCGTCAAGC CAGTAGTGTG TT - #CGGCACCT    1320    #  1335    - (2) INFORMATION FOR SEQ ID NO:26:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs 
   (A) LENGTH: 1178 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m tuberculosis    -    (vii) IMMEDIATE SOURCE:    #from M. tuberculosisAntigen 85C    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:    - AGGTGTCCGG GCCGACGCTG AATCGTTAGC CAACCGCGAT CTCGCGCTGC GG - #CCACGACA      60    - TTCGAACTGA GCGTCCTCGG TGTGTTTCAC TCGCCCAGAA CAGATTCGAC CG - #CGTCGTGC     120    - GCAGATGAGA GTTGGGATTG GTAGTAGCTA TGACGTTCTT CGAACAGGTG CG - #AAGGTTGC     180    - GGAGCGCAGC GACAACCCTG CCGCGCCGCG TGGCTATCGC GGCTATGGGG GC - #TGTCCTGG     240    - TTTACGGTCT GGTCGGTACC TTCGGCGGGC CGGCCACCGC GGGCGCATTC TC - #TAGGCCCG     300    - GTCTTCCAGT GGAATATCTG CAGGTGCCAT CCGCGTCGAT GGGCCGCGAC AT - #CAAGGTCC     360    - AGTTCCAGGG CGGCGGACCG CACGCGGTCT ACCTGCTCGA CGGTCTGCGG GC - #CCAGGATG     420    - ACTACAACGG CTGGGACATC AACACCCCGG CCTTCGAGGA GTACTACCAG TC - #AGGGTTGT     480    - CGGTGATCAT GCCCGTGGGC GGCCAATCCA GTTTCTACAC CGACTGGTAT CA - #GCCCTCGC     540    - AGAGCAACGG CCAGAACTAC ACCTACAAGT GGGAGACCTT CCTTACCAGA GA - #GATGCCCG     600    - CCTGGCTACA GGCCAACAAG GGCGTGTCCC CGACAGGCAA CGCGGCGGTG GG - #TCTTTCGA     660    - TGTCGGGCGG TTCCGCGCTG ATCCTGGCCG CGTACTACCC GCAGCAGTTC CC - #GTACGCCG     720    - CGTCGTTGTC GGGCTTCCTC AACCCGTCCG AGGGCTGGTG GCCGACGCTG AT - #CGGCCTGG     780    - CGATGAACGA CTCGGGCGGT TACAACGCCA ACAGCATGTG GGGTCCGTCC AG - #CGACCCGG     840    - CCTGGAAGCG CAACGACCCA ATGGTTCAGA TTCCCCGCCT GGTCGCCAAC AA - #CACCCGGA     900    - TCTGGGTGTA CTGCGGTAAC GGCACACCCA GCGACCTCGG CGGCGACAAC AT - #ACCGGCGA     960    - AGTTCCTGGA AGGCCTCACC CTGCGCACCA ACCAGACCTT CCGGGACACC TA - #CGCGGCCG    1020    - ACGGTGGACG CAACGGGGTG TTTAACTTCC CGCCCAACGG AACACACTCG TG - #GCCCTACT    1080    - GGAACGAGCA GCTGGTCGCC ATGAAGGCCG ATATCCAGCA TGTGCTCAAC GG - #CGCGACAC    1140    #   1178           TGCG CCGGCCGCCT 
GAGCCAGC    - (2) INFORMATION FOR SEQ ID NO:27:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 185 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m bovis    -    (vii) IMMEDIATE SOURCE:    #sequence from M. bovis BCG strain                   1173P2    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:    - CTGCTCGACG GTCTGCGGGC CCAGGATGAC TACAACGGCT GGGACATCAA CA - #CCCCGGCC      60    - TTCGAGGAGT ACTACCAGTC AGGGTTGTCG GTGATCATGC CCGTGGGCGG CC - #AATCCAGT     120    - TTCTACACCG ACTGGTATCA GCCCTCGCAG AGCAACGGCC AGAACTACAC TT - #ACAAGTGG     180    #           185    - (2) INFORMATION FOR SEQ ID NO:28:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 338 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m tuberculosis    -    (vii) IMMEDIATE SOURCE:    #protein sequence from M.gen 85A                   tuberculosis    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:    - Met Gln Leu Val Asp Arg Val Arg Gly Ala Va - #l Thr Gly Met Ser Arg    #                15    - Arg Leu Val Val Gly Ala Val Gly Ala Ala Le - #u Val Ser Gly Leu Val    #            30    - Gly Ala Val Gly Gly Thr Ala Thr Ala Gly Al - #a Phe Ser Arg Pro Gly    #        45    - Leu Pro Val Glu Tyr Leu Gln Val Pro Ser Pr - #o Ser Met Gly Arg Asp    #    60    - Ile Lys Val Gln Phe Gln Ser Gly Gly Ala As - #n Ser Pro Ala Leu Tyr    #80    - Leu Leu Asp Gly Leu Arg Ala Gln Asp Asp Ph - #e Ser Gly Trp Asp Ile    #                95    - Asn Thr Pro Ala Phe Glu Trp Tyr Asp Gln Se - #r Gly Leu Ser Val Val    #           110    - Met Pro Val Gly Gly Gln Ser Ser Phe Tyr Se - #r Asp Trp Tyr Gln Pro    #       125    - Ala Cys Gly Lys Ala Gly Cys Gln Thr Tyr Ly - #s Trp 
Glu Thr Phe Leu    #   140    - Thr Ser Glu Leu Pro Gly Trp Leu Gln Ala As - #n Arg His Val Lys Pro    145                 1 - #50                 1 - #55                 1 -    #60    - Thr Gly Ser Ala Val Val Gly Leu Ser Met Al - #a Ala Ser Ser Ala Leu    #               175    - Thr Leu Ala Ile Tyr His Pro Gln Gln Phe Va - #l Tyr Ala Gly Ala Met    #           190    - Ser Gly Leu Leu Asp Pro Ser Gln Ala Met Gl - #y Pro Thr Leu Ile Gly    #       205    - Leu Ala Met Gly Asp Ala Gly Gly Tyr Lys Al - #a Ser Asp Met Trp Gly    #   220    - Pro Lys Glu Asp Pro Ala Trp Gln Arg Asn As - #p Pro Leu Leu Asn Val    225                 2 - #30                 2 - #35                 2 -    #40    - Gly Lys Leu Ile Ala Asn Asn Thr Arg Val Tr - #p Val Tyr Cys Gly Asn    #               255    - Gly Lys Pro Ser Asp Leu Gly Gly Asn Asn Le - #u Pro Ala Lys Phe Leu    #           270    - Glu Gly Phe Val Arg Thr Ser Asn Ile Lys Ph - #e Gln Asp Ala Tyr Asn    #       285    - Ala Gly Gly Gly His Asn Gly Val Phe Asp Ph - #e Pro Asp Ser Gly Thr    #   300    - His Ser Trp Glu Tyr Trp Gly Ala Gln Leu As - #n Ala Met Lys Pro Asp    305                 3 - #10                 3 - #15                 3 -    #20    - Leu Gln Arg Ala Leu Gly Ala Thr Pro Asn Th - #r Gly Pro Ala Pro Gln    #               335    - Gly Ala    - (2) INFORMATION FOR SEQ ID NO:29:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 325 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m bovis    -    (vii) IMMEDIATE SOURCE:    #protein sequence fromntigen 85B                   alpha-antige - #n of M.bovis    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:    - Met Thr Asp Val Ser Arg Lys Ile Arg Ala Tr - #p Gly Arg Arg Leu Met    #                15    - Ile Gly Thr Ala Ala Ala Val Val Leu Pro Gl - #y Leu Val Gly Leu Ala    #            30    - Gly Gly Ala Ala Thr 
Ala Gly Ala Phe Ser Ar - #g Pro Gly Leu Pro Val    #        45    - Glu Tyr Leu Gln Val Pro Ser Pro Ser Met Gl - #y Arg Asp Ile Lys Val    #    60    - Gln Phe Gln Ser Gly Gly Asn Asn Ser Pro Al - #a Val Tyr Leu Leu Asp    #80    - Gly Leu Arg Ala Gln Asp Asp Tyr Asn Gly Tr - #p Asp Ile Asn Thr Pro    #                95    - Ala Phe Glu Trp Tyr Tyr Gln Ser Gly Leu Se - #r Ile Val Met Pro Val    #           110    - Gly Gly Gln Ser Ser Phe Tyr Ser Asp Trp Ty - #r Ser Pro Ala Cys Gly    #       125    - Lys Ala Gly Cys Gln Thr Tyr Lys Trp Glu Th - #r Leu Leu Thr Ser Glu    #   140    - Leu Pro Gln Trp Leu Ser Ala Asn Arg Ala Va - #l Lys Pro Thr Gly Ser    145                 1 - #50                 1 - #55                 1 -    #60    - Ala Ala Ile Gly Leu Ser Met Ala Gly Ser Se - #r Ala Met Ile Leu Ala    #               175    - Ala Tyr His Pro Gln Gln Phe Ile Tyr Ala Gl - #y Ser Leu Ser Ala Leu    #           190    - Leu Asp Pro Ser Gln Gly Met Gly Pro Ser Le - #u Ile Gly Leu Ala Met    #       205    - Gly Asp Ala Gly Gly Tyr Lys Ala Ala Asp Me - #t Trp Gly Pro Ser Ser    #   220    - Asp Pro Ala Trp Glu Arg Asn Asp Pro Thr Gl - #n Gln Ile Pro Lys Leu    225                 2 - #30                 2 - #35                 2 -    #40    - Val Ala Asn Asn Thr Arg Leu Trp Val Tyr Cy - #s Gly Asn Gly Thr Pro    #               255    - Asn Glu Leu Gly Gly Ala Asn Ile Pro Ala Gl - #u Phe Leu Glu Asn Phe    #           270    - Val Arg Ser Ser Asn Leu Lys Phe Gln Asp Al - #a Tyr Lys Pro Ala Gly    #       285    - Gly His Asn Ala Val Phe Asn Phe Pro Pro As - #n Gly Thr His Ser Trp    #   300    - Glu Tyr Trp Gly Ala Gln Leu Asn Ala Met Ly - #s Gly Asp Leu Gln Ser    305                 3 - #10                 3 - #15                 3 -    #20    - Ser Leu Gly Ala Gly                    325    - (2) INFORMATION FOR SEQ ID NO:30:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 325 amino              (B) TYPE: amino acid              (D) TOPOLOGY: 
linear    -     (ii) MOLECULE TYPE: protein    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m kansasii    -    (vii) IMMEDIATE SOURCE:              (B) CLONE: Partial prot - #ein sequence from antigen 85B                   from M.ka - #nsasii    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:    - Met Thr Asp Val Ser Gly Lys Ile Arg Ala Tr - #p Gly Arg Arg Leu Leu    #                15    - Val Gly Ala Ala Ala Ala Ala Ala Leu Pro Gl - #y Leu Val Gly Leu Ala    #            30    - Gly Gly Ala Ala Thr Ala Gly Ala Phe Ser Ar - #g Pro Gly Leu Pro Val    #        45    - Glu Tyr Leu Gln Val Pro Ser Ala Ala Met Gl - #y Arg Ser Ile Lys Val    #    60    - Gln Phe Gln Ser Gly Gly Asp Asn Ser Pro Al - #a Val Tyr Leu Leu Asp    #80    - Gly Leu Arg Ala Gln Asp Asp Tyr Asn Gly Tr - #p Asp Ile Asn Thr Pro    #                95    - Ala Phe Glu Trp Tyr Tyr Gln Ser Gly Leu Se - #r Val Ile Met Pro Val    #           110    - Gly Gly Gln Ser Ser Phe Tyr Ser Asp Trp Ty - #r Ser Pro Ala Cys Gly    #       125    - Lys Ala Gly Cys Thr Thr Tyr Lys Trp Glu Th - #r Phe Leu Thr Ser Glu    #   140    - Leu Pro Gln Trp Leu Ser Ala Asn Arg Ser Va - #l Lys Pro Thr Gly Ser    145                 1 - #50                 1 - #55                 1 -    #60    - Ala Ala Val Gly Ile Ser Met Ala Gly Ser Se - #r Ala Leu Ile Leu Ser    #               175    - Val Tyr His Pro Gln Gln Phe Ile Tyr Ala Gl - #y Ser Leu Ser Ala Leu    #           190    - Met Asp Pro Ser Gln Gly Met Gly Pro Ser Le - #u Ile Gly Leu Ala Met    #       205    - Gly Asp Ala Gly Gly Tyr Lys Ala Ser Asp Me - #t Trp Gly Pro Ser Ser    #   220    - Asp Pro Ala Trp Gln Arg Asn Asp Pro Ser Le - #u His Ile Pro Glu Leu    225                 2 - #30                 2 - #35                 2 -    #40    - Val Ala Asn Asn Thr Arg Leu Trp Ile Tyr Cy - #s Gly Asn Gly Thr Pro    #               255    - Ser Glu Leu Gly Gly Ala Asn Val Pro Ala Gl - #u Phe Leu Glu Asn Phe    #           270    - Val Arg Ser Ser 
Asn Leu Lys Phe Gln Asp Al - #a Tyr Asn Ala Ala Gly    #       285    - Gly His Asn Ala Val Phe Asn Leu Asp Ala As - #n Gly Thr His Ser Trp    #   300    - Glu Tyr Trp Gly Ala Gln Leu Asn Ala Met Ly - #s Gly Asp Leu Gln Ala    305                 3 - #10                 3 - #15                 3 -    #20    - Ser Leu Gly Ala Arg                    325    - (2) INFORMATION FOR SEQ ID NO:31:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 340 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m tuberculosis    -    (vii) IMMEDIATE SOURCE:              (B) CLONE: Protein sequ - #ence from antigen 85C from M.                   tuberculosis    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:    - Met Thr Phe Phe Glu Gln Val Arg Arg Leu Ar - #g Ser Ala Ala Thr Thr    #                15    - Leu Pro Arg Arg Val Ala Ile Ala Ala Met Gl - #y Ala Val Leu Val Tyr    #            30    - Gly Leu Val Gly Thr Phe Gly Gly Pro Ala Th - #r Ala Gly Ala Phe Ser    #        45    - Arg Pro Gly Leu Pro Val Glu Tyr Leu Gln Va - #l Pro Ser Ala Ser Met    #    60    - Gly Arg Asp Ile Lys Val Gln Phe Gln Gly Gl - #y Gly Pro His Ala Val    #80    - Tyr Leu Leu Asp Gly Leu Arg Ala Gln Asp As - #p Tyr Asn Gly Trp Asp    #                95    - Ile Asn Thr Pro Ala Phe Glu Glu Tyr Tyr Gl - #n Ser Gly Leu Ser Val    #           110    - Ile Met Pro Val Gly Gly Gln Ser Ser Phe Ty - #r Thr Asp Trp Tyr Gln    #       125    - Pro Ser Gln Ser Asn Gly Gln Asn Tyr Thr Ty - #r Lys Trp Glu Thr Phe    #   140    - Leu Thr Arg Glu Met Pro Ala Trp Leu Gln Al - #a Asn Lys Gly Val Ser    145                 1 - #50                 1 - #55                 1 -    #60    - Pro Thr Gly Asn Ala Ala Val Gly Leu Ser Me - #t Ser Gly Gly Ser Ala    #               175    - Leu Ile Leu Ala Ala Tyr Tyr Pro Gln Gln Ph - #e Pro Tyr Ala Ala Ser    #           190    - Leu 
Ser Gly Phe Leu Asn Pro Ser Glu Gly Tr - #p Trp Pro Thr Leu Ile    #       205    - Gly Leu Ala Met Asn Asp Ser Gly Gly Tyr As - #n Ala Asn Ser Met Trp    #   220    - Gly Pro Ser Ser Asp Pro Ala Trp Lys Arg As - #n Asp Pro Met Val Gln    225                 2 - #30                 2 - #35                 2 -    #40    - Ile Pro Arg Leu Val Ala Asn Asn Thr Arg Il - #e Trp Val Tyr Cys Gly    #               255    - Asn Gly Thr Pro Ser Asp Leu Gly Gly Asp As - #n Ile Pro Ala Lys Phe    #           270    - Leu Glu Gly Leu Thr Leu Arg Thr Asn Gln Th - #r Phe Arg Asp Thr Tyr    #       285    - Ala Ala Asp Gly Gly Arg Asn Gly Val Phe As - #n Phe Pro Pro Asn Gly    #   300    - Thr His Ser Trp Pro Tyr Trp Asn Glu Gln Le - #u Val Ala Met Lys Ala    305                 3 - #10                 3 - #15                 3 -    #20    - Asp Ile Gln His Val Leu Asn Gly Ala Thr Pr - #o Pro Ala Ala Pro Ala    #               335    - Ala Pro Ala Ala                340    - (2) INFORMATION FOR SEQ ID NO:32:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 57 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (vi) ORIGINAL SOURCE:              (A) ORGANISM: Mycobacteriu - #m bovis    -    (vii) IMMEDIATE SOURCE:              (B) CLONE: Partial sequ - #ence from M. 
bovis BCG strain                   1173P2    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:    - Tyr Leu Leu Asp Gly Leu Arg Ala Gln Asp As - #p Tyr Asn Gly Trp Asp    #                15    - Ile Asn Thr Pro Ala Phe Glu Glu Tyr Tyr Gl - #n Ser Gly Leu Ser Val    #            30    - Ile Met Pro Val Gly Gly Gln Ser Ser Phe Ty - #r Thr Asp Trp Tyr Gln    #        45    - Pro Ser Gln Ser Asn Gly Gln Asn Tyr    #    55    - (2) INFORMATION FOR SEQ ID NO:33:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 27 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:    #oligonucleotide primernse P78    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:    #             27   GTGA CATCAAG    - (2) INFORMATION FOR SEQ ID NO:34:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 27 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:              (B) CLONE: Antisense P7 - #9 oligonucleotide primer    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:    #             27   CACT TGTAAGT    - (2) INFORMATION FOR SEQ ID NO:35:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 18 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:    #oligonucleotide probe-A-labeled    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:    #  18              TG    - (2) INFORMATION FOR SEQ ID NO:36:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 18 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:              (B) 
CLONE: Oligonucleotide - # probe-B    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:    #  18              TG    - (2) INFORMATION FOR SEQ ID NO:37:    -      (i) SEQUENCE CHARACTERISTICS:    #pairs    (A) LENGTH: 24 base              (B) TYPE: nucleic acid              (C) STRANDEDNESS: single              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: DNA (genomic)    -    (vii) IMMEDIATE SOURCE:    #oligonucleotide probe-C-labeled    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:    #                24AGAA CTAC    - (2) INFORMATION FOR SEQ ID NO:38:    -      (i) SEQUENCE CHARACTERISTICS:    #acids    (A) LENGTH: 26 amino              (B) TYPE: amino acid              (D) TOPOLOGY: linear    -     (ii) MOLECULE TYPE: protein    -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:    - Phe Ser Arg Pro Gly Leu Pro Val Glu Tyr Le - #u Gln Val Pro Ser Ala    #                15    - Ser Met Gly Arg Asp Ile Lys Val Gln Phe    #            25    __________________________________________________________________________

We claim:
 1. An isolated nucleic acid segment comprising a nucleotide sequence selected from the group consisting of: a nucleotide sequence comprising nucleotides from the nucleotide position (1) to the nucleotide position (149) in SEQ ID NO:2; a nucleotide sequence coding for a polypeptide extending from the amino acid at position 1 to the amino acid at position 46 in SEQ ID NO:3; a nucleotide sequence coding for a polypeptide extending from the amino acid at position 26 to the amino acid at position 46 in SEQ ID NO:3; a nucleotide sequence coding for peptide SQSNGQNY (SEQ ID NO:4); a nucleotide sequence coding for peptide PMVQIPRLVA (SEQ ID NO:5); a nucleotide sequence coding for peptide GLTRTNQTFRDTYAADGGRNG (SEQ ID NO:6); a nucleotide sequence coding for peptide PPAAPAAPAA (SEQ ID NO:7); a nucleotide sequence fully complementary to the nucleotide sequences listed above; and a nucleotide sequence corresponding to the nucleotide sequences listed above in which nucleotide T is replaced by nucleotide U.
 2. The isolated nucleic acid according to claim 1, comprising a nucleic acid coding for a peptide selected from the group consisting of: a peptide extending from the amino acid at position 1 to the amino acid at position (-1) 46 in SEQ ID NO:3; a peptide extending from the amino acid at position 26 to the amino acid at position 1 in SEQ ID NO:3; a peptide extending from the amino acid at position 1 to the amino acid at position 340 in SEQ ID NO:3; a peptide extending from the amino acid at position 26 to the amino acid at position 340 in SEQ ID NO:3; and a peptide extending from the amino acid at 47 to the amino acid at position 340 in SEQ ID NO:3.
 3. The isolated nucleic acid segment according to claim 1, encoding for a protein, wherein said protein: reacts selectively with human sera from tuberculosis patients, is recognized by antibodies which recognize the amino acid sequence extending from the amino acid at position 47 to the amino acid at position 340 in SEQ ID NO:3; or generates antibodies which recognize the amino acid sequence extending from the amino acid at position 47 to the amino acid at position 340 in SEQ ID NO:3.
 4. The isolated nucleic acid segment according to claim 1 encoding for a mature protein of about 30 to about 35 kDa and containing a sequence for a signal peptide.
 5. A recombinant vector comprising: a vector sequence, wherein said vector sequence is selected from the group consisting of a plasmid, a cosmid, a phage DNA, and a virus DNA; and a nucleic acid segment according to claim 1, inserted in one of the non-essential sites for replication of said vector sequence.
 6. A recombinant vector comprising in one of its non-essential sites for replication elements necessary to promote the expression of the polypeptide or peptide encoded by the nucleic acid segment according to claim 1 in a cellular host and also a promoter recognized by the RNA polymerase of the cellular host.
 7. The recombinant vector according to claim 5 comprising elements enabling expression by E. coli of the polypeptide or peptide.
 8. A Mycobacterium bovis BCG vaccine strain transformed with the DNA segment according to claim 1, wherein said nucleic segment further comprises a nucleotide acid sequence encoding a foreign epitope.
 9. An isolated host cell which is transformed by a recombinant vector according to claim 6.
 10. The isolated host cell according to claim 9, wherein said isolated host cell is a prokaryote or a eukaryotic organism.
 11. An isolated nucleic acid segment comprising a nucleotide sequence selected from the following nucleotide sequences: a nucleotide sequence extending from the nucleotide at position (150) to the nucleotide at position (287) in SEQ ID NO:2; a nucleotide sequence extending from the nucleotide at position (224) to the nucleotide at position (287) in SEQ ID NO:2; a nucleotide sequence extending from the nucleotide at position (537) to the nucleotide at position (560) in SEQ ID NO:2; a nucleotide sequence extending from the nucleotide at position (858) to the nucleotide at position (887) in SEQ ID NO:2; a nucleotide sequence extending from the nucleotide at position (972) to the nucleotide at position (1037) in SEQ ID NO:2; a nucleotide sequence extending from the nucleotide at position (1140) to the nucleotide at position (1169) in SEQ ID NO:2; a nucleotide sequence fully complementary to the nucleotide sequences listed above; and a nucleotide sequence corresponding to the nucleotide sequences listed above in which nucleotide T is replaced by nucleotide U.